Actually, it seems that the line causing my problem really was missing a column. I checked the behavior of StringToArrayConverter in org.apache.phoenix.util.csv, and it does not exhibit such behavior.
So the fault is on my end. Thanks

From: Cox, Jonathan A
Sent: Wednesday, March 30, 2016 3:36 PM
To: 'user@phoenix.apache.org'
Subject: Problem Bulk Loading CSV with Empty Value at End of Row

I am using the CsvBulkLoadTool to ingest a tab-separated file that can contain empty columns. The problem is that the loader incorrectly interprets an empty last column as a non-existent column (instead of as a null entry). For example, imagine I have a comma-separated CSV with the following format:

key,username,password,gender,position,age,school,favorite_color

Now, let's say my CSV file contains the following row, where the gender field is missing. This will load correctly:

*#Ssj289,joeblow,sk29ssh, ,CEO,102,MIT,blue<new line>

However, if the missing field happens to be the last entry (favorite_color), the loader complains that only 7 of the 8 required columns are present:

*#Ssj289,joeblow,sk29ssh,female ,CEO,102,MIT, <new line>

This throws an error and the entire CSV file fails to load. Any pointers on how I can modify the source so that Phoenix interprets <delimiter><newline> as an empty/null last column?
Thanks,
Jon

(actual error is pasted below)

java.lang.Exception: java.lang.RuntimeException: java.lang.IllegalArgumentException: CSV record does not have enough values (has 26, but needs 27)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: CSV record does not have enough values (has 26, but needs 27)
    at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:197)
    at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:72)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: CSV record does not have enough values (has 26, but needs 27)
    at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:74)
    at org.apache.phoenix.util.csv.CsvUpsertExecutor.execute(CsvUpsertExecutor.java:44)
    at org.apache.phoenix.util.UpsertExecutor.execute(UpsertExecutor.java:133)
    at org.apache.phoenix.mapreduce.FormatToBytesWritableMapper.map(FormatToBytesWritableMapper.java:166)
    ... 10 more
16/03/30 15:01:01 INFO mapreduce.Job: Job job_local1507432235_0
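For anyone who lands on this thread with a genuinely truncated last column: a classic way trailing empty fields get silently dropped in hand-rolled Java parsing code (not in Phoenix's Commons-CSV-based parser, as confirmed above) is String.split's default behavior of discarding trailing empty strings. A minimal sketch of the failure mode and the fix (the row contents here are illustrative, not from the actual data):

```java
// Demonstrates why a trailing delimiter can look like a missing column.
public class TrailingEmptyFieldDemo {
    public static void main(String[] args) {
        // 8 columns; the last one (favorite_color) is empty.
        String row = "key1,joeblow,sk29ssh,female,CEO,102,MIT,";

        // split with the default limit (0) removes trailing empty
        // strings, so the row appears to have only 7 columns.
        String[] lossy = row.split(",");
        System.out.println(lossy.length); // prints 7

        // A negative limit keeps trailing empty strings: 8 columns,
        // the last of which is the empty string (i.e. a null value).
        String[] complete = row.split(",", -1);
        System.out.println(complete.length); // prints 8
        System.out.println(complete[7].isEmpty()); // prints true
    }
}
```

So if custom pre-processing sits between the raw file and the bulk loader, passing a negative limit to split (or using a real CSV parser) preserves the empty final column.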