Hi,

Backslash is the default escape character used when parsing CSV data during a bulk import, so it has a special meaning. With backslash as the escape character, a lone backslash at the very end of a line starts an escape sequence that runs straight into end-of-input, which is why commons-csv throws "EOF whilst processing escape sequence". You can supply a different (custom) escape character with the -e or --escape flag on the command line, so that CSV files containing backslashes like yours parse properly.
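For example, a run along these lines should load your file (a minimal sketch: the client jar name is a placeholder for whatever your install uses, and '|' is just one choice of escape character, assuming it never occurs in your data):

hadoop jar phoenix-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
    -g -t A -i a.csv -e '|'

With '|' as the escape character, the trailing backslash in a.csv is read as an ordinary character rather than the start of an escape sequence.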
- Gabriel

On Thu, Dec 8, 2016 at 9:10 AM, rubysina <[email protected]> wrote:
> hi, I'm new to phoenix sql and here's a little problem.
>
> I'm following this page http://phoenix.apache.org/bulk_dataload.html
> I just found that the MapReduce importer could not load file with lines
> ended with backslash
> even with the -g parameter, i.e. ignore-errors, "java.io.IOException: EOF
> whilst processing escape sequence"
>
> but it's OK if the line contains backslash but not at the end of line,
>
> and there's no problem when using psql.py to load the same file.
>
> why? how?
>
> thank you.
>
> -----------------------------------------------------------------------------------------------
> for example:
>
> create table a(a char(100) primary key)
>
> echo \\>a.csv
> cat a.csv
> \
> hdfs dfs -put a.csv
> ...JsonBulkLoadTool -g -t a -i a.csv
> -- error
> 16/12/08 15:44:21 INFO mapreduce.Job: Task Id : attempt_1481093434027_0052_m_000000_0, Status : FAILED
> Error: java.lang.RuntimeException: java.lang.RuntimeException: java.io.IOException: EOF whilst processing escape sequence
>         at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:202)
>         at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:74)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.RuntimeException: java.io.IOException: EOF whilst processing escape sequence
>         at org.apache.commons.csv.CSVParser$1.getNextRecord(CSVParser.java:398)
>         at org.apache.commons.csv.CSVParser$1.hasNext(CSVParser.java:407)
>         at com.google.common.collect.Iterators.getNext(Iterators.java:890)
>         at com.google.common.collect.Iterables.getFirst(Iterables.java:781)
>         at org.apache.phoenix.mapreduce.CsvToKeyValueMapper$CsvLineParser.parse(CsvToKeyValueMapper.java:109)
>         at org.apache.phoenix.mapreduce.CsvToKeyValueMapper$CsvLineParser.parse(CsvToKeyValueMapper.java:91)
>         at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:161)
>         ... 9 more
>
>
> echo \\a>a.csv
> cat a.csv
> \a
> hdfs dfs -rm a.csv
> hdfs dfs -put a.csv
> ...JsonBulkLoadTool -g -t a -i a.csv
> -- success
>
>
> echo \\>a.csv
> cat a.csv
> \
> psql.py -t A zoo a.csv
> CSV Upsert complete. 1 rows upserted
> -- success
>
>
> thank you.
