Hi Jean-Marc,

I decided to create a composite key, *ticker-date*, from the CSV file.
I just did some manipulation on the CSV file:

export IFS=","
sed -i 1d tsco.csv
cat tsco.csv | while read a b c d e f; do echo "TSCO-$a,TESCO PLC,TSCO,$a,$b,$c,$d,$e,$f"; done > temp
mv -f temp tsco.csv

This takes the CSV file, tells the shell that the field separator is a comma (IFS=","), drops the header, reads the fields of every line (a, b, c, ...), creates the composite key TSCO-$a, and adds the stock name and ticker to each line. The whole process can be automated and parameterised.

Once the CSV file is in HDFS, I run the following command:

$HBASE_HOME/bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.separator=',' \
  -Dimporttsv.columns="HBASE_ROW_KEY,stock_info:stock,stock_info:ticker,stock_daily:Date,stock_daily:open,stock_daily:high,stock_daily:low,stock_daily:close,stock_daily:volume" \
  tsco hdfs://rhes564:9000/data/stocks/tsco.csv

The HBase table is created as follows:

create 'tsco','stock_info','stock_daily'

and this is the data (two rows, each with two column families and eight columns):

hbase(main):132:0> scan 'tsco', LIMIT => 2
ROW             COLUMN+CELL
 TSCO-1-Apr-08  column=stock_daily:Date, timestamp=1475507091676, value=1-Apr-08
 TSCO-1-Apr-08  column=stock_daily:close, timestamp=1475507091676, value=405.25
 TSCO-1-Apr-08  column=stock_daily:high, timestamp=1475507091676, value=406.75
 TSCO-1-Apr-08  column=stock_daily:low, timestamp=1475507091676, value=379.25
 TSCO-1-Apr-08  column=stock_daily:open, timestamp=1475507091676, value=380.00
 TSCO-1-Apr-08  column=stock_daily:volume, timestamp=1475507091676, value=49664486
 TSCO-1-Apr-08  column=stock_info:stock, timestamp=1475507091676, value=TESCO PLC
 TSCO-1-Apr-08  column=stock_info:ticker, timestamp=1475507091676, value=TSCO
 TSCO-1-Apr-09  column=stock_daily:Date, timestamp=1475507091676, value=1-Apr-09
 TSCO-1-Apr-09  column=stock_daily:close, timestamp=1475507091676, value=333.30
 TSCO-1-Apr-09  column=stock_daily:high, timestamp=1475507091676, value=334.60
 TSCO-1-Apr-09  column=stock_daily:low, timestamp=1475507091676, value=326.50
 TSCO-1-Apr-09  column=stock_daily:open, timestamp=1475507091676, value=331.10
 TSCO-1-Apr-09  column=stock_daily:volume, timestamp=1475507091676, value=24877341
 TSCO-1-Apr-09  column=stock_info:stock, timestamp=1475507091676, value=TESCO PLC
 TSCO-1-Apr-09  column=stock_info:ticker, timestamp=1475507091676, value=TSCO

What do you think?

Thanks

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On 3 October 2016 at 15:10, Jean-Marc Spaggiari <jean-m...@spaggiari.org> wrote:

> Hi Mich,
>
> As you said, it's most probably because it's all the same key... If you
> want to be 200% sure, just alter VERSIONS => '1' to be greater (like, 10)
> and scan all the versions of the cells. You should see the others.
>
> JMS
>
> 2016-10-03 3:41 GMT-04:00 Mich Talebzadeh <mich.talebza...@gmail.com>:
>
> > Hi,
> >
> > when I use the command line utility ImportTsv to load a file into Hbase
> > with the following table format
> >
> > describe 'marketDataHbase'
> > Table marketDataHbase is ENABLED
> > marketDataHbase
> > COLUMN FAMILIES DESCRIPTION
> > {NAME => 'price_info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
> > 1 row(s) in 0.0930 seconds
> >
> > hbase org.apache.hadoop.hbase.mapreduce.ImportTsv
> > -Dimporttsv.separator=','
> > -Dimporttsv.columns="HBASE_ROW_KEY, stock_daily:ticker, stock_daily:tradedate, stock_daily:open,stock_daily:high,stock_daily:low,stock_daily:close,stock_daily:volume"
> > tsco hdfs://rhes564:9000/data/stocks/tsco.csv
> >
> > There are 1200 rows in the csv file, *but it only loads the first row!*
> >
> > scan 'tsco'
> > ROW        COLUMN+CELL
> > Tesco PLC  column=stock_daily:close, timestamp=1475447365118, value=325.25
> > Tesco PLC  column=stock_daily:high, timestamp=1475447365118, value=332.00
> > Tesco PLC  column=stock_daily:low, timestamp=1475447365118, value=324.00
> > Tesco PLC  column=stock_daily:open, timestamp=1475447365118, value=331.75
> > Tesco PLC  column=stock_daily:ticker, timestamp=1475447365118, value=TSCO
> > Tesco PLC  column=stock_daily:tradedate, timestamp=1475447365118, value= 3-Jan-06
> > Tesco PLC  column=stock_daily:volume, timestamp=1475447365118, value=46935045
> > 1 row(s) in 0.0390 seconds
> >
> > Is this because the hbase_row_key --> Tesco PLC is the same for all? I
> > thought that the row key can be anything.
> >
> > Thanks
> >
> > Dr Mich Talebzadeh
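
For reference, the CSV preparation described at the top of this message (drop the header, then prefix each line with a ticker-date key, the stock name and the ticker) could be parameterised roughly as below. This is only a sketch under stated assumptions: the function name add_composite_key and its argument order are made up for illustration, it uses awk instead of the while-read loop shown above, and it assumes the first line of the file is a header and the first field of every other line is the trade date.

```shell
# Hypothetical helper (not part of the original commands):
#   add_composite_key TICKER "STOCK NAME" INPUT.csv
# Writes the transformed rows to standard output; INPUT.csv is left untouched.
add_composite_key() {
    ticker=$1
    name=$2
    infile=$3
    # NR > 1 skips the header line; $1 is assumed to be the trade date.
    # Each output row is: TICKER-date,STOCK NAME,TICKER,<original row>
    awk -F',' -v OFS=',' -v t="$ticker" -v n="$name" \
        'NR > 1 { print t "-" $1, n, t, $0 }' "$infile"
}
```

For example, add_composite_key TSCO "TESCO PLC" tsco.csv > tsco_keyed.csv would write the keyed rows to a new file (tsco_keyed.csv is just a placeholder name) that can then be put into HDFS and loaded with ImportTsv as before.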