A custom line separator is a reasonable request. Please open JIRAs for the HBase and/or Phoenix import tools -- and provide a patch, if you're feeling generous ;)
On Thu, Feb 12, 2015 at 10:39 AM, Siva <[email protected]> wrote:

> Hi Gabriel,
>
> Using a special character other than \n as the line separator does not work
> even with HBase ImportTsv. But I found something called RichImportTsv on
> GitHub:
>
> https://github.com/kawaa/RichImportTsv
>
> It is 3 years old, though, and was implemented with the old APIs. We should
> take a step to rewrite it with the new API.
>
> Thanks,
> Siva.
>
> On Wed, Feb 11, 2015 at 11:40 PM, Gabriel Reid <[email protected]> wrote:
>
>> Hi Siva,
>>
>> The Bulk CSV Loader (i.e. the MapReduce-based loader) definitely won't
>> support records split over multiple input lines. It could be that loading
>> via PSQL (as described on http://phoenix.apache.org/bulk_dataload.html)
>> will allow multi-line records, as this might be supported by the
>> underlying CSV parsing library (commons-csv), although I'm not sure. In
>> any case, I can't really give you any advice on how to make it work there
>> if it isn't working right now.
>>
>> I assume this also won't work in HBase's ImportTsv.
>>
>> - Gabriel
>>
>>
>> On Thu, Feb 5, 2015 at 10:28 PM, Siva <[email protected]> wrote:
>> > We have a table that contains a NOTE column, and this column contains
>> > lines of text separated by newlines. When I load the data from a .csv
>> > file through the bulk loader, Phoenix fails with an error, and HBase
>> > cuts the text off at the first newline and treats the rest of the NOTE
>> > as a new record.
>> >
>> > Is there a way to specify the line separator in the HBase or Phoenix
>> > bulk load?
>> >
>> > With Phoenix:
>> >
>> > HADOOP_CLASSPATH=/usr/hdp/2.2.0.0-2041/hbase/lib/hbase-protocol.jar:/usr/hdp/2.2.0.0-2041/hbase/conf \
>> >   hadoop jar /usr/hdp/2.2.0.0-2041/phoenix/phoenix-4.2.0.2.2.0.0-2041-client.jar \
>> >   org.apache.phoenix.mapreduce.CsvBulkLoadTool --table test_leadwarehouse \
>> >   --input /user/sbhavanari/test_leadwarehouse.csv \
>> >   --zookeeper <zookeeper Ip>:2181:/hbase
>> >
>> > With HBase ImportTsv:
>> >
>> > hbase org.apache.hadoop.hbase.mapreduce.ImportTsv '-Dimporttsv.separator=,' \
>> >   -Dimporttsv.columns=<col_list> test_leadwarehouse \
>> >   /user/data/test_leadwarehouse.csv
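
A possible workaround while neither loader accepts a custom line separator: flatten the embedded newlines before the bulk load, so that every record sits on one input line. The sketch below is only an illustration and is not part of Phoenix or HBase. It assumes the multi-line NOTE values are enclosed in double quotes in the exported .csv (otherwise an embedded newline cannot be told apart from a record boundary). The function name and the placeholder are hypothetical, so pick whatever marker suits your data.

```python
import csv
import io

# Hypothetical stand-in for an embedded newline: a literal backslash followed by "n".
PLACEHOLDER = "\\n"


def flatten_multiline_fields(src, dst, placeholder=PLACEHOLDER):
    """Copy CSV records from src to dst, replacing newlines inside fields.

    Assumes multi-line fields are double-quoted in the input, so the CSV
    parser can tell them apart from record boundaries.
    Returns the number of records written (including any header row).
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    count = 0
    for row in reader:
        cleaned = [
            field.replace("\r\n", placeholder)
                 .replace("\n", placeholder)
                 .replace("\r", placeholder)
            for field in row
        ]
        writer.writerow(cleaned)
        count += 1
    return count


# Small in-memory demonstration: the second record has a two-line NOTE value.
sample = 'id,note\n1,"first line\nsecond line"\n2,single line\n'
out = io.StringIO()
written = flatten_multiline_fields(io.StringIO(sample), out)
print(out.getvalue())
```

For real files, open both with newline="" as the Python csv documentation recommends. Note that the loaded NOTE values will then contain the placeholder text rather than real newlines, so whatever reads the column has to convert it back.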
