My hobbyist coding updates and releases as the mysterious "Mr. Tines"

Friday, 24 August 2007

Selection Factors

I had started porting CTClib to managed C++ as the first step of a piecemeal Iron<something> conversion. But at the moment IronPython doesn't have a good deployment story, IronRuby isn't all there, and I still much prefer Java's UI model to WinForms or WPF… So what about JRuby, with its familiar UI style and run-from-jar deployment?

The stumbling block, as always, is PhilZ's choice of deflate with a 2^13-byte window (13 window bits) for compression, as opposed to the 2^15-byte window that java.util.zip fixes and that Ruby's Zlib uses by default. This was where I paused my first Java port; JZlib, which could do the job, has appeared since I last looked, around the turn of the century. Even so, for the sake of portability I've decided to bite the bullet and do a minimal zlib/deflate implementation in Ruby, using JZlib as a guide, which also gives me something meaningful to do with the language. Inflate can, of course, use the larger window (as current CTClib-C builds do), so it can rely on the built-in version, be it the C version from native Ruby or JRuby's java.util.zip wrapper.
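
The point about inflate tolerating the larger window can be sketched as below. This assumes a C Ruby whose zlib binding accepts a window-bits argument, so it says nothing about a wrapper sitting on java.util.zip; the sample payload and the use of raw deflate (negative window bits) are choices made for illustration only.

```ruby
require 'zlib'

# Repetitive sample payload, purely illustrative.
data = ("CTClib " * 4096).b

# Negative window bits select raw deflate (no zlib header or trailer);
# -13 asks for a 2^13-byte window.
deflater = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -13)
packed = deflater.deflate(data, Zlib::FINISH)
deflater.close

# Inflating with the full 2^15-byte window accepts the smaller-window stream.
inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
unpacked = inflater.inflate(packed)
inflater.close

puts unpacked == data
```

The reverse direction is the awkward one: a stream written with a 2^15-byte window may contain back-references that a 2^13-byte window cannot reach.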

So, the easy bit first -- Adler 32 checksum…

module Com_ravnaandtines
  module Zlib
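    # largest prime smaller than 65536
    ADLER_BASE = 65521

    # expose adler32 as Zlib.adler32 as well as a mixin method,
    # so callers need not include the module into Object first
    module_function

    # Adler32 checksum : takes a seed (usually 1), and a byte sequence,
    # returns 32-bit integer; a nil buffer yields the initial value 1
    def adler32(adler, buffer)
      return 1 unless buffer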
      
      ## string to array
      ## otherwise expect an each method to yield integers
      if buffer.respond_to? :unpack
        buffer = buffer.unpack("C*")
      end
      
      ## build up the checksum
      low = adler & 0xffff
      high = (adler >> 16) & 0xffff
      buffer.each do |x| 
        low += (x.to_i & 0xff)
        high += low
      end

      ## collapse into modular parts
      low %= ADLER_BASE
      high %= ADLER_BASE
      (high << 16) | low
    end
  end
end
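
The listing above leans on Ruby's arbitrary-precision integers: both sums grow without bound and are reduced once at the end. A minimal sketch of a variant that reduces after every 5552 bytes, the block size zlib's C code uses to keep the sums inside 32 bits, might look like this. The name adler32_chunked and the constant ADLER_NMAX are made up for the example, and the stdlib Zlib.adler32 is used only as a cross-check.

```ruby
require 'zlib' # stdlib binding, used here only as a cross-check

ADLER_BASE = 65521
# zlib's C code reduces every 5552 bytes so the sums fit in 32 bits
ADLER_NMAX = 5552

# Hypothetical helper, not part of the module above.
# Takes a seed and an array of byte values.
def adler32_chunked(adler, bytes)
  low = adler & 0xffff
  high = (adler >> 16) & 0xffff
  bytes.each_slice(ADLER_NMAX) do |chunk|
    chunk.each do |x|
      low += x & 0xff
      high += low
    end
    # reduce once per chunk rather than once per byte
    low %= ADLER_BASE
    high %= ADLER_BASE
  end
  (high << 16) | low
end

data = "Mark Adler"
sum = adler32_chunked(1, data.unpack("C*"))
puts format("%08x", sum)        # prints 13070394
puts sum == Zlib.adler32(data)  # prints true
```

On long inputs this keeps the intermediate values small, which matters more for speed than for correctness in Ruby.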

Test vectors for the unit tests had to be scavenged from the internet:

require 'test/unit'
require 'tinesware_zlib'
include Com_ravnaandtines::Zlib

class AdlerTest < Test::Unit::TestCase

  def basic_test_engine(seq, expected)
    a = Com_ravnaandtines::Zlib.adler32(1, seq)
    assert_equal(expected, a)
  end

  def test_boundary
    a = Com_ravnaandtines::Zlib.adler32(0, nil)
    assert_equal(1, a)    
  end

  def test_simple_0
    basic_test_engine("Mark Adler", 0x13070394)
  end
  
  def test_simple_1
    basic_test_engine("\x00\x01\x02\x03", 0x000e0007)
  end
 
  def test_simple_2
    basic_test_engine("\x00\x01\x02\x03\x04\x05\x06\x07",  0x005c001d)
  end

  def test_simple_3
    basic_test_engine("\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f", 0x02b80079)
  end

  def test_simple_4
    basic_test_engine("\x41\x41\x41\x41", 0x028e0105)
  end

  def test_simple_5
    basic_test_engine("\x42\x42\x42\x42\x42\x42\x42\x42", 0x09500211)
  end

  def test_simple_6
    basic_test_engine("\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43", 0x23a80431)
  end

  class Vector #arrays filled with value = (byte) index
    def initialize(size)
      @size = size
    end
    def each
      index = 0
      while index < @size
        yield index & 0xff
        index += 1
      end
    end
  end

  def test_total
    index = 0
    results = [  486795068,
                1525910894,
                3543032800,
                2483946130,
                4150712693,
                3878123687,
                3650897945,
                1682829244,
                1842395054,
                 460416992,
                3287492690,
                 479453429,
                3960773095,
                2008242969,
                4130540683,
                1021367854,
                4065361952,
                2081116754,
                4033606837,
                1162071911 ]

    
    while index < 20
      size = 5*index + 1
      xx = Vector.new(1000*size)
      basic_test_engine(xx, results[index])
      index += 1
    end
  end

end
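
Where a C Ruby with the zlib extension is to hand, the scavenged vectors can also be cross-checked against the built-in checksum. The sketch below assumes that binding is available; index_bytes is a made-up helper that builds the same byte pattern as the Vector class. It also shows why the checksum takes a seed: feeding the result for the first part back in as the seed for the second gives the checksum of the whole.

```ruby
require 'zlib'

# Hypothetical helper: n bytes where each byte is (index & 0xff),
# the same pattern the Vector class yields.
def index_bytes(n)
  Array.new(n) { |i| i & 0xff }.pack("C*")
end

# Sizes used by test_total: 1000 * (5 * index + 1)
vectors = (0...3).map do |index|
  Zlib.adler32(index_bytes(1000 * (5 * index + 1)))
end
puts vectors.first # prints 486795068

# Seed chaining: checksum of head, then tail seeded with that result.
whole = index_bytes(6000)
head = whole[0, 2500]
tail = whole[2500..-1]
puts Zlib.adler32(tail, Zlib.adler32(head)) == Zlib.adler32(whole) # prints true
```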

That was one evening. Now, how long will the rest of it take?
