After a quick review: JsonLocation is only useful when there is an exception,
e.g. "you suck at line 3, column 6, offset 18". So we need to be able to open
the file, go to that position in gedit/Notepad++/other and check the syntax
error. Otherwise, however clever the counting is, it is really useless.
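
To make the "editor friendly" counting concrete, here is a minimal sketch,
not the RI's or our actual implementation. LocationTracker and its methods
are hypothetical names. It assumes lines and columns both start at 1, that
'\n' is the only line terminator (a '\r' is counted as an ordinary column),
and that the offset counts Java chars (UTF-16 code units), not bytes.

```java
// Hypothetical helper for illustration only; not part of JSR 353.
final class LocationTracker {
    private long line = 1;    // 1-based, like text editors
    private long column = 1;  // 1-based column of the next char to be read
    private long offset = 0;  // number of chars consumed so far

    /** Records one consumed char and updates line, column and offset. */
    void consume(char c) {
        offset++;
        if (c == '\n') {      // assumption: '\n' alone ends a line
            line++;
            column = 1;
        } else {
            column++;
        }
    }

    long line() { return line; }
    long column() { return column; }
    long offset() { return offset; }

    /** Returns the position reached after consuming the first count chars of text. */
    static LocationTracker after(CharSequence text, int count) {
        LocationTracker t = new LocationTracker();
        for (int i = 0; i < count && i < text.length(); i++) {
            t.consume(text.charAt(i));
        }
        return t;
    }

    public static void main(String[] args) {
        // The colon after "b" is missing, so a parser would complain at the '2'.
        String json = "{\n  \"a\": 1,\n  \"b\" 2\n}";
        LocationTracker t = after(json, json.indexOf('2'));
        // prints: line 3, column 7, offset 18
        System.out.println("line " + t.line() + ", column " + t.column()
                + ", offset " + t.offset());
    }
}
```

With this convention the offset always equals the total length of the
previous lines (newlines included) plus column - 1, which is the consistency
that matters when jumping to the reported position in an editor.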

If we need to break this to pass the TCKs we will, but I'm sure we'll keep
this as the default, no?



Romain Manni-Bucau
Twitter: @rmannibucau
Blog: http://rmannibucau.wordpress.com/
LinkedIn: http://fr.linkedin.com/in/rmannibucau
Github: https://github.com/rmannibucau


2014-07-24 13:42 GMT+02:00 Hendrik Dev <[email protected]>:

> On Thu, Jul 24, 2014 at 1:35 PM, Romain Manni-Bucau
> <[email protected]> wrote:
> > I think we start with 1. But for the column I don't really care, we can
> > align on the RI.
> >
> > For the offset I'm not sure what is complicated, but we should ensure the
> > offset corresponds to the sum of the previously parsed columns for all
> > lines. As long as this is consistent, the overall system works.
>
> I agree, but IMHO the API says different things (column is always chars,
> offset can be bytes or chars according to the JSR).
>
> My proposal is to keep things easy for now, until we have the TCK. I will
> start columns at 1 (it's common and expected IMHO) and defer the byte/char
> counting until the TCK arrives.
> The RI also does not count bytes and chars differently.
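
To illustrate how far byte and char offsets can drift apart, here is a small
sketch using only the JDK. OffsetDemo and byteOffset are hypothetical names,
and re-encoding a prefix of the text is only a demonstration; a real parser
would count bytes while reading. It also assumes charCount does not split a
surrogate pair.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only; not part of JSR 353.
final class OffsetDemo {
    /** Number of bytes the first charCount chars of text occupy in charset. */
    static long byteOffset(String text, int charCount, Charset charset) {
        return text.substring(0, charCount).getBytes(charset).length;
    }

    public static void main(String[] args) {
        // \u00eb is an "e" with diaeresis: 1 char, but 2 bytes in UTF-8.
        String json = "{\"name\":\"Zo\u00eb\",\"x\" 1}";
        int charOffset = json.indexOf('1'); // position of the unexpected value

        System.out.println("char offset:   " + charOffset);                  // 18
        System.out.println("UTF-8 bytes:   "
                + byteOffset(json, charOffset, StandardCharsets.UTF_8));     // 19
        // UTF_16BE is used so that no byte order mark is added to the count.
        System.out.println("UTF-16BE bytes: "
                + byteOffset(json, charOffset, StandardCharsets.UTF_16BE));  // 36
    }
}
```

The same error position gives three different numbers depending on the input
source and charset, so a user who does not know how the parser was created
cannot interpret a byte offset reliably.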
>
> Kind regards
> Hendrik
>
>
> > 2014-07-24 12:39 GMT+02:00 Hendrik Dev <[email protected]>:
> >
> >> Doing this efficiently is more complicated than I thought. Can we not
> >> simply count 2 bytes for one char ;-)
> >>
> >> BTW, it seems the JsonLocation column value also leaves room for
> >> interpretation:
> >>
> >> Is the leftmost column 0 or 1? Text editors, for example, start with
> >> column 1 (there is never a column 0), but the RI starts with 0.
> >>
> >> Regards
> >> Hendrik
> >>
> >>
> >> On Wed, Jul 23, 2014 at 1:49 PM, Hendrik Dev <[email protected]>
> >> wrote:
> >> > agree, will make it so
> >> >
> >> > On Wed, Jul 23, 2014 at 1:28 PM, Romain Manni-Bucau
> >> > <[email protected]> wrote:
> >> >> Hi
> >> >>
> >> >> I agree the wording is wrong, but IMO it is not ambiguous: we get an
> >> >> InputStream or a Reader (and we *don't* want to check whether it is a
> >> >> file or not), so we just count the chars or bytes we read. Any other
> >> >> implementation would lead to confusion IMO (it keeps us friendly with
> >> >> a default text file reader).
> >> >>
> >> >> We can start this way and go further if we have issues, but I really
> >> >> doubt we will need to.
> >> >>
> >> >> What's your opinion?
> >> >>
> >> >> 2014-07-23 13:21 GMT+02:00 Hendrik Dev <[email protected]>:
> >> >>
> >> >>> Hi,
> >> >>>
> >> >>> The JSR 353 API says this about JsonLocation.getStreamOffset():
> >> >>>
> >> >>> "long getStreamOffset()
> >> >>>
> >> >>> Return the stream offset into the input source this location is
> >> >>> pointing to. If the input source is a file or a byte stream then
> >> >>> this is the byte offset into that stream, but if the input source is
> >> >>> a character media then the offset is the character offset. Returns
> >> >>> -1 if there is no offset available."
> >> >>>
> >> >>> There are IMHO two issues here:
> >> >>>
> >> >>> 1) How can we know that the input source is a file (stream)? We can
> >> >>> only know whether the parser reads from an InputStream (= byte stream)
> >> >>> or from a Reader (= character stream). The wording here is
> >> >>> unclear/ambiguous.
> >> >>>
> >> >>> 2) Since a UTF-8 or UTF-16 character can map to one, two, three or
> >> >>> four bytes, the output can be very confusing (especially if the user
> >> >>> doesn't know whether the parser was constructed from a byte or a
> >> >>> character stream and which charset is used).
> >> >>>
> >> >>> It seems that the RI does not implement these distinctions; if I
> >> >>> looked correctly, it always returns character offsets.
> >> >>>
> >> >>> So what do we want to do?
> >> >>>
> >> >>> Thanks
> >> >>> Hendrik
> >> >>>
> >> >>>
> >> >>> --
> >> >>> Hendrik Saly (salyh, hendrikdev22)
> >> >>> @hendrikdev22
> >> >>> PGP: 0x22D7F6EC
> >> >>>
