Uri, Bernardo, all,

Thanks for your thoughts.
Here's a more detailed explanation:
We are reading in feeds (typically TSV or CSV files), parsing them,
reformatting the data, and spitting it out as another file.
These feeds can come from all sorts of people/places/things.
So sometimes they are wonderfully formatted and we understand their content.
Sometimes they are not, and we do not.

Imagine a feed that is a list of products.
Each row lists the name, type, description, and price of a product.
Now say you have someone using some sort of CMS to create their feed.
Now say that they have all of their information stored in Word files.
It is very easy to see the scenario where the person will just copy and
paste the description (with line breaks) into their CMS.
Their CMS may just enclose this field in quotes and move on.
So now we have a feed where the description column may have linebreaks in
it.
So I can't just split on any form of linebreak.
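
To make that concrete, here is a rough Perl sketch of the kind of
quote-aware reading I think I need. It is only an illustration, not
tested against real feeds: read_record and the sample feed are made-up
names and data, and it assumes fields are wrapped in double quotes,
embedded quotes are doubled (""), and physical lines end in "\n".

```perl
use strict;
use warnings;

# Read one logical record from $fh: keep appending physical lines until the
# number of double quotes seen is even, i.e. no quoted field is still open.
# Assumption: embedded quotes are doubled (""), not backslash-escaped.
sub read_record {
    my ($fh) = @_;
    my $record = <$fh>;
    return unless defined $record;
    while ( ( $record =~ tr/"// ) % 2 ) {    # odd count: a quote is still open
        my $more = <$fh>;
        last unless defined $more;           # unterminated quote at end of file
        $record .= $more;
    }
    return $record;
}

# Hypothetical two-product feed; the first description spans two lines.
my $feed = qq{Widget,tool,"Line one\nLine two",9.99\nGadget,toy,"Plain",4.50\n};
open my $fh, '<', \$feed or die "Cannot open in-memory feed: $!";

my @records;
while ( defined( my $rec = read_record($fh) ) ) {
    push @records, $rec;
}
print scalar(@records), " records\n";
```

Counting quotes is fragile compared to a real CSV parser; the point is
only that a record boundary depends on quoting state, not just on the
line terminator.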

Does this make a bit more sense?

Btw, there's no chance that I could define $/ as a regex, is there?

Thanks again for the help.
--Alex
On Jan 24, 2008 10:03 AM, Bernardo Rechea <[EMAIL PROTECTED]> wrote:

> On Wednesday 23 January 2008 19:26, Uri Guttman wrote:
> > define inline break. it can't be a newline or CR as those define
> > lines. you need clearer specs and data examples if you want more help.
>
> Expanding on Uri's comment, what do you want to do with the files
> while "dealing" with line terminators? Are you simply trying to normalize
> line terminators from different platforms to a common one? Also, by
> 'inline break' perhaps you mean what's often called a "hard" line break,
> i.e., when distinguishing between paragraphs and lines, one that doesn't
> define a paragraph ending, but simply cuts lines at a certain length? In
> that case you need something different to terminate paragraphs, e.g., a
> blank line (two "line breaks" in a row), or similar.
>
> Bernardo
>
> _______________________________________________
> Boston-pm mailing list
> Boston-pm@mail.pm.org
> http://mail.pm.org/mailman/listinfo/boston-pm
>