Re: CSV Import Update

2007-07-20 Thread Derek Atkins
Josh Sled [EMAIL PROTECTED] writes:

 Benjamin Sperisen [EMAIL PROTECTED] writes:
 What can the importer do now?
 4. It can handle three different column types: Amount, Date, and
 Description. The columns can be in any order.

 Other columns you may want to support:

 - transaction number
 - transaction identifier
 - comment

 It might be that the easiest thing to do is just concatenate some of those
 fields together into the memo of the resultant transaction, rather than try
 to get a 1-to-1 mapping to gnucash fields.

Maybe provide a few options for how to translate these columns?  I
could certainly see one mapping into the Num column, maybe another
tying into the Action, and then there's the Transaction Notes field
and the individual Split Memo fields.
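
To make the idea concrete, here is a rough Python sketch of a per-column
mapping in which columns that share a target are concatenated (which also
covers the "just put it all in the memo" approach). The Target names and
map_row are made up for illustration; nothing here is the importer's
actual code.

```python
from enum import Enum


class Target(Enum):
    """Hypothetical destinations for a CSV column (illustrative names)."""
    DATE = "date"
    AMOUNT = "amount"
    DESCRIPTION = "description"
    NUM = "num"
    ACTION = "action"
    NOTES = "notes"    # transaction-level notes
    MEMO = "memo"      # split-level memo
    IGNORE = "ignore"


def map_row(row, targets, joiner=" / "):
    """Build a field dict from one CSV row.

    Columns that share a target are concatenated in column order.
    """
    fields = {}
    for value, target in zip(row, targets):
        value = value.strip()
        if target is Target.IGNORE or not value:
            continue
        key = target.value
        fields[key] = fields[key] + joiner + value if key in fields else value
    return fields
```

Assigning both the "transaction identifier" and "comment" columns to
Target.MEMO would then give a memo such as "TXN-77 / weekly shop".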

 5. This is not necessarily a CSV import issue, but the date parsing
 function does not yet take into account the issue raised by Thomas
 (https://lists.gnucash.org/pipermail/gnucash-devel/2007-July/020954.html).

 I'm not sure if the mixed-separator issue is significant outside of QIF
 files, or otherwise wide-spread.

I'd agree that we shouldn't worry about this now.  We can do other
magic to figure out the year, even just hardcoding (for now)
1969-2060 and then worrying about it again in another 50 years.
If your code is still here in 50 years then I'll be extremely
impressed!
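
A minimal Python sketch of that hardcoded window, assuming the problem
is expanding a two-digit year. The function name is hypothetical, and
rejecting 61-68 is just one reading of "1969-2060", not a settled
decision.

```python
def expand_two_digit_year(yy):
    """Map a two-digit year into the hardcoded 1969-2060 window."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year, got %r" % (yy,))
    if yy >= 69:
        return 1900 + yy   # 69..99 -> 1969..1999
    if yy <= 60:
        return 2000 + yy   # 00..60 -> 2000..2060
    # 61..68 fall outside the window; refuse rather than guess.
    raise ValueError("year %02d is outside the 1969-2060 window" % yy)
```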

 I did notice that converting the txt-format downloads I can get from Bank
 of America (which look like `lynx -dump`ed versions of their html, honestly)
 into CSV left me with just 'mm/dd' dates (no year).  That doesn't seem all
 that unreasonable, and should be supported as well.

Agreed, it should handle this by assuming either current year or
+/- 6 months depending on a preference (IMHO).
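
Roughly, in Python (a sketch only; the function name and the "nearest"
flag standing in for the preference are assumptions for illustration):

```python
from datetime import date


def resolve_yearless_date(month, day, today, nearest=False):
    """Attach a year to an 'mm/dd' date.

    nearest=False: assume the current year.
    nearest=True: pick the year that puts the date closest to today,
    which normally keeps it within about +/- 6 months.
    """
    if not nearest:
        return date(today.year, month, day)
    candidates = []
    for year in (today.year - 1, today.year, today.year + 1):
        try:
            candidates.append(date(year, month, day))
        except ValueError:
            continue  # e.g. 02/29 in a non-leap year
    if not candidates:
        raise ValueError("not a valid month/day")
    return min(candidates, key=lambda d: abs((d - today).days))
```

With today set to 2007-01-10, "12/20" resolves to 2006-12-20 in nearest
mode and to 2007-12-20 in current-year mode.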

 Problem 4 will largely take time, but if anyone has any CSV files that
 they'd like to try this with, feel free -- the code should be stable
 enough that it's not totally unusable at this point. As long as you

 Another thing I noticed with my manufactured CSV file was that some of the
 lines had '$'s before the value, and some had column-based credit/debit value
 distinctions; I could imagine other CSV-providers that used +/- for such a
 distinction.  This might end up as an RFE building on this project, but
 handling these cases seems pretty important.

Yes, it should accept:

  CurrencySymbol (maybe this can signify a currency, or just be ignored)
  +/- (explicit positive/negative)
  (value)   (explicit negative)
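
As a hedged sketch of accepting those three forms, here is one way it
could look in Python. parse_amount is a hypothetical name, and treating
"," as a thousands separator and "." as the decimal point is an
assumption that would not hold for every locale.

```python
import re
from decimal import Decimal

# optional sign, optional currency symbol, optional sign, then digits
_AMOUNT = re.compile(r"([+-]?)\s*([^\d\s.,+-]*)\s*([+-]?)\s*(\d[\d,]*(?:\.\d+)?)")


def parse_amount(text):
    """Return (value, currency_symbol_or_None) for one amount cell."""
    s = text.strip()
    negative = False
    if s.startswith("(") and s.endswith(")"):
        negative = True               # (value) means explicit negative
        s = s[1:-1].strip()
    m = _AMOUNT.fullmatch(s)
    if m is None:
        raise ValueError("unrecognised amount: %r" % text)
    sign_before, symbol, sign_after, digits = m.groups()
    if sign_before and sign_after:
        raise ValueError("two signs in amount: %r" % text)
    sign = sign_before or sign_after
    if sign and negative:
        raise ValueError("both parentheses and a sign: %r" % text)
    if sign == "-":
        negative = True
    value = Decimal(digits.replace(",", ""))
    return (-value if negative else value), (symbol or None)
```

The symbol is returned rather than discarded so the caller can decide
whether to ignore it or use it to pick a currency.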

 encountered that before (though I could be wrong). I'll also have to
 learn some about regular expressions, as I know next to nothing about
 them, to add the functionality.

 FWIW, regexp is one of the most useful things I've ever learned, so don't
 hesitate! :)

Absolutely.  I highly recommend you learn regex if it's at all useful to you.

-derek

-- 
   Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
   Member, MIT Student Information Processing Board  (SIPB)
   URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
   [EMAIL PROTECTED]        PGP key available
___
gnucash-devel mailing list
gnucash-devel@gnucash.org
https://lists.gnucash.org/mailman/listinfo/gnucash-devel


Re: CSV Import Update

2007-07-19 Thread Josh Sled
Benjamin Sperisen [EMAIL PROTECTED] writes:
 What can the importer do now?
 4. It can handle three different column types: Amount, Date, and
 Description. The columns can be in any order.

Other columns you may want to support:

- transaction number
- transaction identifier
- comment

It might be that the easiest thing to do is just concatenate some of those
fields together into the memo of the resultant transaction, rather than try
to get a 1-to-1 mapping to gnucash fields.


 5. This is not necessarily a CSV import issue, but the date parsing
 function does not yet take into account the issue raised by Thomas
 (https://lists.gnucash.org/pipermail/gnucash-devel/2007-July/020954.html).

I'm not sure if the mixed-separator issue is significant outside of QIF
files, or otherwise wide-spread.

I did notice that converting the txt-format downloads I can get from Bank
of America (which look like `lynx -dump`ed versions of their html, honestly)
into CSV left me with just 'mm/dd' dates (no year).  That doesn't seem all
that unreasonable, and should be supported as well.


 Problem 4 will largely take time, but if anyone has any CSV files that
 they'd like to try this with, feel free -- the code should be stable
 enough that it's not totally unusable at this point. As long as you

Another thing I noticed with my manufactured CSV file was that some of the
lines had '$'s before the value, and some had column-based credit/debit value
distinctions; I could imagine other CSV-providers that used +/- for such a
distinction.  This might end up as an RFE building on this project, but
handling these cases seems pretty important.


 encountered that before (though I could be wrong). I'll also have to
 learn some about regular expressions, as I know next to nothing about
 them, to add the functionality.

FWIW, regexp is one of the most useful things I've ever learned, so don't
hesitate! :)


 Anyway, that's where the code is and where it still needs to go. As
 always, let me know if you have any comments or suggestions!

This is looking very good... very nice work, so far. :)

-- 
...jsled
http://asynchronous.org/ - a=jsled; b=asynchronous.org; echo [EMAIL PROTECTED]




CSV Import Update

2007-07-15 Thread Benjamin Sperisen
Hi Josh and Christian,

I thought I'd just write a summary of the status of the CSV import
code, and what still needs to be done.

What can the importer do now?
1. It can handle different encodings.
2. It can handle different separators and various date formats.
3. It is date-separator agnostic. (This uses the regex recommended
by Derek:
https://lists.gnucash.org/pipermail/gnucash-devel/2007-July/020948.html)
4. It can handle three different column types: Amount, Date, and
Description. The columns can be in any order.
5. If there is an error with any of the rows, it will interpret the
rows that do not have errors, and then let the user try changing the
configuration to parse the remaining rows.
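
The behaviour in item 5 can be sketched like this in Python. This is an
illustration of the idea only, not the importer's code; import_rows and
parse_row are hypothetical names.

```python
def import_rows(rows, parse_row):
    """Parse every row; keep the good results and remember the failures.

    parse_row is any callable that returns a parsed record or raises
    ValueError. Returns (parsed, failed), where failed holds
    (row_index, raw_row) pairs that can be retried later.
    """
    parsed, failed = [], []
    for index, row in enumerate(rows):
        try:
            parsed.append(parse_row(row))
        except ValueError:
            failed.append((index, row))
    return parsed, failed
```

After the user changes the configuration, only the rows in "failed" need
to be passed through import_rows again with the new parse_row.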

What shortcomings does the importer have?
1. When errors do occur, the user is not informed of what they are,
just which rows have them.
2. There is not yet any support for fixed-width files.
3. The column type selection interface is somewhat clumsy. You have to
click on the column to select the row, and then click, hold and move
to select a new type. I don't think this is particularly obvious, but
this is the best I can think of with stock GTK+ widgets.
4. It has not undergone extensive testing.
5. This is not necessarily a CSV import issue, but the date parsing
function does not yet take into account the issue raised by Thomas
(https://lists.gnucash.org/pipermail/gnucash-devel/2007-July/020954.html).
(It is also hiding as a static function in one of my source files, but
it could easily be moved.)

Problems 1 and 2 are just a matter of coding, and I know at least
roughly how to attack them.

Problem 3 is a bit trickier. I'm thinking it could be slightly
improved by automatically selecting the treeview row when the dialog
is shown, but I'm wondering if there is a more elegant interface for
this sort of thing. I did a lot of experimentation with an hbox of
comboboxes that would resize automatically by doing size requests (to
align themselves with the treeview below), but that didn't work very
well, since when I expanded the window, I couldn't shrink back.

Problem 4 will largely take time, but if anyone has any CSV files that
they'd like to try this with, feel free -- the code should be stable
enough that it's not totally unusable at this point. As long as you
don't try to use a fixed-width file, you should be able to import your
CSV file (and if it doesn't work, any feedback on that is definitely
welcome!). I've also committed an example file in my branch at
gnucash/src/import-export/csv/example-file.csv.

Finally, I'm thinking I can maybe solve problem 5 by adding a boolean
argument to the function to specify that the string should be parsed
as a QIF date string (to take into account apostrophes). I feel fairly
safe ignoring apostrophes in CSV land, just because I've never
encountered that before (though I could be wrong). I'll also have to
learn some about regular expressions, as I know next to nothing about
them, to add the functionality.
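
To show what I mean by the boolean argument, here is a rough Python
sketch (the real function is C and looks different). The name
parse_date_mdy, the month/day/year order, and the four-digit year are
all assumptions made to keep the example short.

```python
import re
from datetime import date


def parse_date_mdy(text, qif=False):
    """Parse month/day/year with any single non-digit separator.

    With qif=True an apostrophe is also accepted as a separator,
    as in 12/31'2007; otherwise it is rejected.
    """
    sep = r"[^\d]" if qif else r"[^\d']"
    pattern = rf"\s*(\d{{1,2}}){sep}\s*(\d{{1,2}}){sep}\s*(\d{{4}})\s*"
    m = re.fullmatch(pattern, text)
    if m is None:
        raise ValueError("unrecognised date: %r" % text)
    month, day, year = (int(g) for g in m.groups())
    return date(year, month, day)
```

So "07/15/2007", "07-15-2007" and "7.15.2007" all parse the same way,
while "12/31'2007" is only accepted when qif is true.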

Anyway, that's where the code is and where it still needs to go. As
always, let me know if you have any comments or suggestions!

Regards,
Benny