So, the solution is to just provide a patch with more cases for escaping in
http://trac.openstreetmap.org/browser/applications/utils/osmosis/src/com/bretth/osmosis/core/xml/common/ProductionDbDataDecoder.java
http://trac.openstreetmap.org/browser/applications/utils/osmosis/src/com/bretth/osmosis/core/xml/common/ProductionDbDataEncoder.java
and hope they work fine?
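
To make sure I understand the problem, here is a minimal sketch of the double encoding and its reversal in plain Java. It is not the osmosis code: the class and method names are made up for illustration, and it assumes the wrong intermediate charset is windows-1252, going by Martijn's remark about a *windows* encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from the osmosis source.
class DoubleEncodingSketch {
    // Assumption: the charset wrongly applied in the first step is windows-1252.
    static final Charset WRONG = Charset.forName("windows-1252");

    // Simulate the damage: correct UTF-8 bytes are misread as windows-1252 text.
    // Storing the result as UTF-8 again gives the doubly encoded bytes.
    static String doubleEncode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, WRONG);
    }

    // Undo the damage: map each character back to its windows-1252 byte,
    // then decode the recovered bytes as UTF-8.
    static String repair(String mangled) {
        byte[] bytes = mangled.getBytes(WRONG);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Format bytes as space-separated lowercase hex for inspection.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String original = "\u00ef"; // i with diaeresis, UTF-8 bytes c3 af
        String mangled = doubleEncode(original);
        // Bytes as they would be stored after the second encoding step.
        System.out.println(hex(mangled.getBytes(StandardCharsets.UTF_8))); // c3 83 c2 af
        System.out.println(repair(mangled).equals(original)); // true
    }
}
```

If the windows-1252 assumption holds, the bytes 0x81, 0x8d, 0x8f, 0x90 and 0x9d have no character assigned in that charset, so characters whose UTF-8 form contains one of them may not survive this round trip. That might be where the extra cases are needed.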

It would of course be better in the long run to fix the main DB, but I'm
not sure what all that would entail. Probably a lot.

Stefan

On Dec 19, 2007 10:36 PM, Brett Henderson <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I've lost my home ADSL (won't line sync, tried two modems, tried different
> leads, doesn't seem to be my end) so I'm mostly offline.  As a result I'm
> unlikely to get onto this issue in the short term.  With Christmas
> approaching I'm bracing myself for a long'ish outage.
>
> If anybody wishes to take a look, the hacked character encoding class is
> named ProductionDbCharset and has two related classes named
> ProductionDbDataEncoder and ProductionDbDataDecoder.
>
> The classes are instantiated within BaseXmlWriter which is extended by the
> XmlWriter class for writing osm files and XmlChangeWriter for osc files.
> The hack works by just passing the doubly encoded data through the osmosis
> pipeline then fixing it before writing to xml.
>
> Not sure how easy it will be to fix without access to a doubly encoded
> database though.
>
> Brett
>
>
>
> On 12/20/07, Martijn van Oosterhout < [EMAIL PROTECTED]> wrote:
> >
> >
> >
> > On Dec 18, 2007 1:04 PM, Stefan Baebler < [EMAIL PROTECTED]> wrote:
> > > I somehow assumed utf8 would be the default choice by now. Also
> > > http://wiki.openstreetmap.org/index.php/Database_schema
> > > mentions utf8 explicitly for every table individually.
> > >
> > > Why does the main API work nicely then?
> > > Why are full planet dumps ok?
> >
> > There's an encoding issue: the encoding the ruby server thinks it is
> > using differs from what the database encoding actually is. The net
> > result is that the data is encoded *twice*. For example (not actual
> > codes, just examples):
> >
> > Original char: character 0xef
> > Encoded as: 0xc3 0xaf
> > Stored as: 0xc0 0xc3 0xc0 0xbf
> >
> > > And more importantly:
> > > How can the same magic be used to get properly utf8 encoded hourly
> > > changes (.osc)?
> >
> > Osmosis is in Java, which is smart enough not to let you do stupid
> > things like getting the database connection encoding wrong. It's just a
> > question of fixing the de-double-encoding hack in osmosis. It doesn't
> > help that it's a *windows* encoding in the first step.
> >
> > Have a nice day,
> > --
> > Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/
> >
> > _______________________________________________
> >
> > dev mailing list
> > dev@openstreetmap.org
> > http://lists.openstreetmap.org/cgi-bin/mailman/listinfo/dev
> >
>
>
