Yes, that's it.  I thought I'd already covered all cases but apparently 
I was wrong.  My home ADSL is back up again so hopefully I'll get a 
chance to check it out soon.

If you see any problems with the current implementation please send a 
patch.  Without production db access it might be difficult though.
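
For anyone who wants to experiment without the production db, the double encoding described further down this thread can be reproduced locally. This is only a minimal sketch: it assumes the mistaken intermediate charset is windows-1252 (the thread only says "a windows encoding"), and the class and method names are made up for illustration. It is not the code in ProductionDbDataEncoder or ProductionDbDataDecoder.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DoubleEncodingSketch {
    // Assumption: the charset the data was wrongly interpreted as.
    public static final Charset WRONG = Charset.forName("windows-1252");

    /** Simulates the bug: UTF-8 bytes are misread as windows-1252 text, then encoded as UTF-8 again. */
    public static byte[] doubleEncode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, WRONG);
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    /** Reverses the bug: decode UTF-8, map the characters back to bytes, decode UTF-8 once more. */
    public static String repair(byte[] stored) {
        String misread = new String(stored, StandardCharsets.UTF_8);
        byte[] utf8 = misread.getBytes(WRONG);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    /** Formats bytes as space-separated lowercase hex. */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\u00ef"; // i with diaeresis
        byte[] stored = doubleEncode(original);
        System.out.println(hex(stored));                      // c3 83 c2 af
        System.out.println(repair(stored).equals(original));  // true
    }
}
```

Under this assumption, a few byte values (0x81, 0x8d, 0x8f, 0x90, 0x9d) have no windows-1252 character, so text whose UTF-8 form contains them may not survive the round trip. That might be the kind of case still missing, but it is a guess until checked against real data.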

Stefan Baebler wrote:
> So, the solution is to just provide a patch with more cases for escaping in
> http://trac.openstreetmap.org/browser/applications/utils/osmosis/src/com/bretth/osmosis/core/xml/common/ProductionDbDataDecoder.java
> http://trac.openstreetmap.org/browser/applications/utils/osmosis/src/com/bretth/osmosis/core/xml/common/ProductionDbDataEncoder.java
> and hope they work fine?
>
> It would of course be better in the long run to fix the main DB, but I'm
> not sure what all this brings along. Probably a lot.
>
> Stefan
>
> On Dec 19, 2007 10:36 PM, Brett Henderson <[EMAIL PROTECTED]> wrote:
>   
>> Hi All,
>>
>> I've lost my home ADSL (won't line sync, tried two modems, tried different
>> leads, doesn't seem to be my end) so I'm mostly offline.  As a result I'm
>> unlikely to get onto this issue in the short term.  With Christmas
>> approaching I'm bracing myself for a long'ish outage.
>>
>> If anybody wishes to take a look, the hacked character encoding class is
>> named ProductionDbCharset and has two related classes named
>> ProductionDbDataEncoder and ProductionDbDataDecoder.
>>
>> The classes are instantiated within BaseXmlWriter which is extended by the
>> XmlWriter class for writing osm files and XmlChangeWriter for osc files.
>> The hack works by just passing the doubly encoded data through the osmosis
>> pipeline then fixing it before writing to xml.
>>
>> Not sure how easy it will be to fix without access to a doubly encoded
>> database though.
>>
>> Brett
>>
>>
>>
>> On 12/20/07, Martijn van Oosterhout < [EMAIL PROTECTED]> wrote:
>>     
>>>
>>> On Dec 18, 2007 1:04 PM, Stefan Baebler < [EMAIL PROTECTED]> wrote:
>>>       
>>>> I somehow assumed utf8 would be the default choice by now. Also
>>>> http://wiki.openstreetmap.org/index.php/Database_schema
>>>> mentions utf8 explicitly for every table individually.
>>>>
>>>> Why does main api work nicely then?
>>>> Why are full planet dumps ok?
>>>>         
>>> There's an encoding issue: the encoding the ruby server thinks the
>>> database uses is different from what the database encoding actually
>>> is. The net result is that the data is encoded *twice*. For example
>>> (not actual codes, just examples):
>>>
>>> Original char: character 0xef
>>> Encoded as: 0xc3 0xaf
>>> Stored as: 0xc0 0xc3 0xc0 0xbf
>>>
>>>       
>>>> And more importantly:
>>>> How can the same magic be used to get properly utf8 encoded hourly
>>>> changes (.osc)?
>>>>
>>> Osmosis is in Java which is smart enough to not let you do stupid
>>> things like getting the database connection encoding wrong. It's just a
>>> question of fixing the de-double-encoding-hack in osmosis. It doesn't
>>> help that it's a *windows* encoding in the first step.
>>>
>>> Have a nice day,
>>> --
>>> Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/
>>>
>>> _______________________________________________
>>>
>>> dev mailing list
>>> dev@openstreetmap.org
>>> http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/dev
>>>
>>>       

