Re: pass compressed string

2011-02-25 Thread Andi Vajda


On Feb 25, 2011, at 5:57, Roman Chyla  wrote:


Hi Andi,

Thanks, JArray_byte() does what I needed - I was (wrongly) passing a
bytestring (which I think got automatically converted to Unicode), and
trying to get the bytes of that string was not correct.

Though it would be interesting to find out if it is possible to pass a
string and get the bytes in Java,


A Java String is not made of bytes but of 16-bit Unicode chars. If I
remember correctly, one overload of String.getBytes() is deprecated in Java
because of encoding issues, and the no-argument form uses the platform's
default charset. Whenever a Python string (type str, made of bytes) is
passed to Java, it is assumed to be encoded as UTF-8 and converted to
16-bit Unicode on the fly.
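
A minimal Python 3 sketch of why a text conversion like this cannot carry compressed data: zlib output is arbitrary binary and is generally not valid UTF-8. The byte literal is the zlib.compress("python") value shown in this thread; the variable names are illustrative only, and nothing here calls JCC itself.

```python
import zlib

# Bytes of zlib.compress("python") as shown in this thread.
raw = b'x\x9c+\xa8,\xc9\xc8\xcf\x03\x00\tW\x02\xa3'

# Strict decoding fails because 0x9c cannot start a UTF-8 sequence.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# A lossy decode substitutes U+FFFD for invalid bytes, so re-encoding
# the text does not give the original compressed bytes back.
lossy_text = raw.decode("utf-8", errors="replace")
round_trip = lossy_text.encode("utf-8")

print(valid_utf8)
print(round_trip == raw)
```

Whatever the exact conversion on the JNI side, binary data that has gone through a text type is not guaranteed to survive, which is why a byte array is the safer thing to pass.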


Andi..


I don't know whether the conversion is
happening on the JNI side or only in Java - I shall do some reading.

Example in python:

In [4]: s = zlib.compress("python")

In [5]: repr(s)
Out[5]: "'x\\x9c+\\xa8,\\xc9\\xc8\\xcf\\x03\\x00\\tW\\x02\\xa3'"

In [6]: lucene.JArray_byte(s)
Out[6]: JArray(120, -100, 43, -88, 44, -55, -56, -49, 3, 0, 9, 87, 2, -93)


The same thing in Jython:


>>> s = zlib.compress("python")
>>> s
'x\x9c+\xa8,\xc9\xc8\xcf\x03\x00\tW\x02\xa3'
>>> repr(s)
"'x\\x9c+\\xa8,\\xc9\\xc8\\xcf\\x03\\x00\\tW\\x02\\xa3'"
>>> String(s).getBytes()
array('b', [120, -62, -100, 43, -62, -88, 44, -61, -119, -61, -120,
-61, -113, 3, 0, 9, 87, 2, -62, -93])
>>> String(s).getBytes('utf8')
array('b', [120, -62, -100, 43, -62, -88, 44, -61, -119, -61, -120,
-61, -113, 3, 0, 9, 87, 2, -62, -93])
>>> String(s).getBytes('utf16')
array('b', [-2, -1, 0, 120, 0, -100, 0, 43, 0, -88, 0, 44, 0, -55, 0,
-56, 0, -49, 0, 3, 0, 0, 0, 9, 0, 87, 0, 2, 0, -93])
>>> String(s).getBytes('ascii')
array('b', [120, 63, 43, 63, 44, 63, 63, 63, 3, 0, 9, 87, 2, 63])




Roman

On Thu, Feb 24, 2011 at 3:42 AM, Andi Vajda  wrote:


On Thu, 24 Feb 2011, Roman Chyla wrote:


I would like to transfer results from python to java:

hello = zlib.compress("hello")

on the java side do:

byte[] data = string.getBytes()

But I am not successful. Is there any translation going on somewhere?


Can you be more specific?
Actual lines of code, errors, expected results, actual results...

An array of bytes in JCC is not created from a string directly but with
JArray('byte')(len or str), which takes either a length or a string:

 >>> import lucene
 >>> lucene.initVM()
 
 >>> lucene.JArray('byte')(10)
 JArray(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
 >>> lucene.JArray('byte')("abcd")
 JArray(97, 98, 99, 100)
 >>>

Andi..



Re: pass compressed string

2011-02-25 Thread Roman Chyla
Hi Andi,

Thanks, JArray_byte() does what I needed - I was (wrongly) passing a
bytestring (which I think got automatically converted to Unicode), and
trying to get the bytes of that string was not correct.

Though it would be interesting to find out if it is possible to pass a
string and get the bytes in Java. I don't know whether the conversion is
happening on the JNI side or only in Java - I shall do some reading.

Example in python:

In [4]: s = zlib.compress("python")

In [5]: repr(s)
Out[5]: "'x\\x9c+\\xa8,\\xc9\\xc8\\xcf\\x03\\x00\\tW\\x02\\xa3'"

In [6]: lucene.JArray_byte(s)
Out[6]: JArray(120, -100, 43, -88, 44, -55, -56, -49, 3, 0, 9, 87, 2, -93)
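
As a sanity check on the JArray output, here is a small Python 3 sketch that maps the signed values printed above back to unsigned bytes and decompresses them. The list is copied from the Out[6] line; the names java_bytes and raw are illustrative, and the sketch uses only the standard library rather than lucene.

```python
import zlib

# Signed Java byte values, copied from the JArray output in this thread.
java_bytes = [120, -100, 43, -88, 44, -55, -56, -49, 3, 0, 9, 87, 2, -93]

# Java bytes are signed (-128..127); masking with 0xFF maps each value
# back to the unsigned 0..255 range that Python's bytes type uses.
raw = bytes(b & 0xFF for b in java_bytes)

print(raw)
print(zlib.decompress(raw))
```

The negative numbers are therefore only a display difference: the array holds exactly the compressed bytes.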

The same thing in Jython:

>>> s = zlib.compress("python")
>>> s
'x\x9c+\xa8,\xc9\xc8\xcf\x03\x00\tW\x02\xa3'
>>> repr(s)
"'x\\x9c+\\xa8,\\xc9\\xc8\\xcf\\x03\\x00\\tW\\x02\\xa3'"
>>> String(s).getBytes()
array('b', [120, -62, -100, 43, -62, -88, 44, -61, -119, -61, -120,
-61, -113, 3, 0, 9, 87, 2, -62, -93])
>>> String(s).getBytes('utf8')
array('b', [120, -62, -100, 43, -62, -88, 44, -61, -119, -61, -120,
-61, -113, 3, 0, 9, 87, 2, -62, -93])
>>> String(s).getBytes('utf16')
array('b', [-2, -1, 0, 120, 0, -100, 0, 43, 0, -88, 0, 44, 0, -55, 0,
-56, 0, -49, 0, 3, 0, 0, 0, 9, 0, 87, 0, 2, 0, -93])
>>> String(s).getBytes('ascii')
array('b', [120, 63, 43, 63, 44, 63, 63, 63, 3, 0, 9, 87, 2, 63])
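
The Jython arrays above can be reproduced in plain Python 3 under one assumption: that each byte of s became the character with the same code point (which is what a Latin-1 decode does) before getBytes() re-encoded it. The helper name signed and the variable names are illustrative only; this is a sketch of the effect, not of Jython's actual implementation.

```python
# Bytes of zlib.compress("python") as shown in this thread.
raw = b'x\x9c+\xa8,\xc9\xc8\xcf\x03\x00\tW\x02\xa3'

def signed(data):
    """Render bytes the way Java prints them: as signed 8-bit values."""
    return [b - 256 if b > 127 else b for b in data]

# Assumption: every byte maps to the char with the same code point.
text = raw.decode("latin-1")

# Code points above 127 take two bytes each in UTF-8 (14 bytes become 20).
utf8_bytes = signed(text.encode("utf-8"))
# The 'utf16' output above starts with a big-endian byte order mark.
utf16_bytes = signed(b"\xfe\xff" + text.encode("utf-16-be"))
# ASCII cannot represent those chars, so each one becomes '?' (63).
ascii_bytes = signed(text.encode("ascii", errors="replace"))
# Latin-1 maps every char back to a single byte with the same value.
latin1_bytes = signed(text.encode("latin-1"))

print(utf8_bytes)
print(utf16_bytes)
print(ascii_bytes)
print(latin1_bytes)
```

Only the Latin-1 round trip returns the original 14 bytes, which suggests (but does not prove) that an ISO-8859-1 charset argument to getBytes() would have preserved the data in the Jython session.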




Roman

On Thu, Feb 24, 2011 at 3:42 AM, Andi Vajda  wrote:
>
> On Thu, 24 Feb 2011, Roman Chyla wrote:
>
>> I would like to transfer results from python to java:
>>
>> hello = zlib.compress("hello")
>>
>> on the java side do:
>>
>> byte[] data = string.getBytes()
>>
>> But I am not successful. Is there any translation going on somewhere?
>
> Can you be more specific ?
> Actual lines of code, errors, expected results, actual results...
>
> An array of bytes in JCC is not created with a string but a
> JArray('byte')(len or str)
>
>  >>> import lucene
>  >>> lucene.initVM()
>  
>  >>> lucene.JArray('byte')(10)
>  JArray(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
>  >>> lucene.JArray('byte')("abcd")
>  JArray(97, 98, 99, 100)
>  >>>
>
> Andi..
>


Re: pass compressed string

2011-02-23 Thread Andi Vajda


On Thu, 24 Feb 2011, Roman Chyla wrote:


I would like to transfer results from python to java:

hello = zlib.compress("hello")

on the java side do:

byte[] data = string.getBytes()

But I am not successful. Is there any translation going on somewhere?


Can you be more specific?
Actual lines of code, errors, expected results, actual results...

An array of bytes in JCC is not created from a string directly but with
JArray('byte')(len or str), which takes either a length or a string:


  >>> import lucene
  >>> lucene.initVM()
  
  >>> lucene.JArray('byte')(10)
  JArray(0, 0, 0, 0, 0, 0, 0, 0, 0, 0)
  >>> lucene.JArray('byte')("abcd")
  JArray(97, 98, 99, 100)
  >>>

Andi..
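
For comparison only, plain Python's bytearray accepts the same two kinds of argument as the JArray('byte') calls above, a length or a byte string. This is an analogy in standard Python 3, not JCC code; the variable names are illustrative.

```python
# A length gives that many zero bytes, like JArray('byte')(10) above.
zeros = list(bytearray(10))

# A byte string gives one element per byte, like JArray('byte')("abcd").
abcd = list(bytearray(b"abcd"))

print(zeros)
print(abcd)
```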


pass compressed string

2011-02-23 Thread Roman Chyla
Hello,

I would like to transfer results from python to java:

hello = zlib.compress("hello")

on the java side do:

byte[] data = string.getBytes()

But I am not successful. Is there any translation going on somewhere?

Thank you,

  Roman