OpenCSV (http://opencsv.sourceforge.net/) is an Apache-licensed library that’s been working well for us for a while.

Straight tokenizing works well if there are no newlines, commas, or other meaningful characters in the content. Even “lastname, firstname middlename” will cause problems because of the comma. Most CSV libraries/parsers are a little more complicated than just tokenizing. If you’re in control of the input data and know that no delimiters, quoting characters, or escape characters will be part of the data, a tokenize strategy will work well. Otherwise, it might be worthwhile to look at a library like OpenCSV. The process works well whether you’re slurping an entire file into memory for processing or streaming lines from a really large file (or files).

As far as getting the data into an EO, I create a dictionary from the CSV record and use that to initialize the EO. (It helps to have the field labels the same as the keys in the EO.)

Larry Mills-Gahl

> On Apr 17, 2015, at 10:05 AM, CHRISTOPH WICK | i4innovation GmbH, Bonn wrote:
>
> Old NeXT-style would be using the method "componentsSeparatedByString" of
> NSArray.
>
> So, this is how I do it (assuming '\n' is used for line breaks and '\t' to
> separate cells, and using Apache Commons IO for file reading):
>
>     // readFileToString takes a java.io.File, not a path string
>     String contentOfFile = FileUtils.readFileToString(new File("/my/file.csv"), "UTF-8");
>     NSArray<String> lines = NSArray.componentsSeparatedByString(contentOfFile, "\n");
>     for (String line : lines) {
>         NSArray<String> cells = NSArray.componentsSeparatedByString(line, "\t");
>         for (String cell : cells) {
>             NSLog.out.appendln("Got cell " + cell);
>         }
>     }
>
> C.U.CW
> --
> The three great virtues of a programmer are Laziness, Impatience and Hubris.
> (Randal Schwartz)
>
>> On 16 Apr 2015, at 09:59, Fabian Peters wrote:
>>
>> Hi Kevin,
>>
>> My CSV importers use something like this:
>>
>>     Scanner scanner = new Scanner(new FileInputStream(importFile), "UTF-8");
>>     try {
>>         // first use a Scanner to get each line
>>         int lineNumber = 0;
>>         while (scanner.hasNextLine()) {
>>             String currentLine = scanner.nextLine();
>>             lineNumber += 1;
>>             …
>>         }
>>     } finally {
>>         scanner.close();
>>     }
>>
>> And for handling the individual lines:
>>
>>     StrTokenizer tokenizer = StrTokenizer.getCSVInstance(aLine);
>>     tokenizer.setDelimiterChar(';');
>>     // tokenize once and reuse the array
>>     String[] tokens = tokenizer.getTokenArray();
>>
>>     for (int i = 0; i < tokens.length; i++) {
>>         String aToken = tokens[i];
>>         if (!ERXStringUtilities.stringIsNullOrEmpty(aToken)) {
>>             switch (i) {
>>                 case 0: {
>>                     …
>>                 }
>>             }
>>         }
>>     }
>>
>> HTH, Fabian
>>
>>> On 16 Apr 2015, at 06:11, Kevin @ alchemy POP wrote:
>>>
>>> I've got a simple Wonder project where I want to populate the database by
>>> importing a CSV file. Before adding the data to the table I intend to do
>>> some cleanup on it, i.e., checking for unwanted characters, etc. Is there an
>>> approved Wonder approach to reading a text file, or is it just straight
>>> Java until it's time to actually add the records to the database?
>>>
>>> I looked through the Wonder docs but did not see anything related to basic
>>> text file I/O.
>>>
>>> Thanks,
>>>
>>> Kevin Spake
>>> Alchemy Billing LLC
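
For anyone following along, here is a minimal sketch of the two points made at the top of this message: why a plain split breaks on a quoted comma, and how a record can be turned into a dictionary keyed by the field labels. It is not OpenCSV's API and not the code we actually run; the class and method names (CsvSketch, parseLine, toDictionary) are made up for illustration. It assumes double quotes as the quoting character, a doubled quote ("") as the escape, and one record per line, so a newline inside a quoted field is not handled. Initializing the EO from the resulting map is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only (not part of OpenCSV or Wonder).
public class CsvSketch {

    // Splits one line into cells. Inside double quotes the delimiter is
    // treated as data, and "" stands for a single literal quote character.
    public static List<String> parseLine(String line, char delimiter) {
        List<String> cells = new ArrayList<>();
        StringBuilder cell = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        cell.append('"'); // escaped quote
                        i++;              // skip the second quote of the pair
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    cell.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;          // opening quote
            } else if (c == delimiter) {
                cells.add(cell.toString());
                cell.setLength(0);
            } else {
                cell.append(c);
            }
        }
        cells.add(cell.toString());       // last cell has no trailing delimiter
        return cells;
    }

    // Pairs field labels with record values, keeping column order.
    // Extra labels or extra values beyond the shorter list are ignored.
    public static Map<String, String> toDictionary(List<String> labels, List<String> record) {
        Map<String, String> dict = new LinkedHashMap<>();
        for (int i = 0; i < labels.size() && i < record.size(); i++) {
            dict.put(labels.get(i), record.get(i));
        }
        return dict;
    }

    public static void main(String[] args) {
        String header = "name,city";
        String row = "\"Doe, Jane Q\",Bonn";

        // Straight tokenizing splits inside the quoted name.
        System.out.println("naive cells: " + row.split(",").length);

        // Quote-aware parsing keeps the name as one cell.
        List<String> cells = parseLine(row, ',');
        System.out.println("parsed cells: " + cells.size());

        // Labels become keys, ready to hand to whatever initializes the EO.
        System.out.println(toDictionary(parseLine(header, ','), cells));
    }
}
```

With the sample row, the naive split yields three pieces while parseLine yields two. For real imports a maintained library is still the safer choice, since it also covers multi-line fields and configurable escape characters.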
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Webobjects-dev mailing list
Help/Unsubscribe/Update your Subscription:
https://lists.apple.com/mailman/options/webobjects-dev/archive%40mail-archive.com
