It depends on the row; the rows only share about 5% of the qualifier names. Each row could have about 500-3,000 columns across 3 column families, and one of the families holds 80% of the columns.
The table has around 75M rows.

On Tue, May 28, 2019 at 17:33, <s...@comcast.net> wrote:

> Guillermo,
>
> How large is your table? How many columns?
>
> Sincerely,
>
> Sean
>
> On May 28, 2019 at 10:11 AM Guillermo Ortiz <konstt2...@gmail.com> wrote:
> >
> > I have a doubt. When you process an HBase table with MapReduce you can
> > use TableInputFormat, which I understand goes directly to the HDFS
> > files (StoreFiles in HDFS), so you can do some filtering in the map
> > phase; it's not the same as going through the region servers to run
> > massive queries. It's possible to do the same using TableInputFormat
> > with Spark, and it's more efficient than using a scan with filters
> > and so on (again) when you want to run a massive query over the whole
> > table. Am I right?
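For reference, a minimal sketch of the Spark-side approach the question
describes: reading the table through TableInputFormat via
newAPIHadoopRDD. The table and column-family names ("my_table", "cf1")
are placeholders, not from the thread. One caveat on the premise:
TableInputFormat still scans through the RegionServers; it is
TableSnapshotInputFormat that reads the StoreFiles in HDFS directly.

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.sql.SparkSession

object HBaseTableScan {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("hbase-tif").getOrCreate()
    val sc = spark.sparkContext

    // Standard HBase client config; TableInputFormat takes the table
    // name (and optional scan narrowing) from it.
    val conf = HBaseConfiguration.create()
    conf.set(TableInputFormat.INPUT_TABLE, "my_table")   // placeholder
    // Push the family restriction server-side instead of filtering
    // every cell in Spark:
    conf.set(TableInputFormat.SCAN_COLUMN_FAMILY, "cf1") // placeholder

    // One Spark partition per region; rows come back as
    // (rowkey, Result) pairs through the RegionServers' scan API.
    val rdd = sc.newAPIHadoopRDD(
      conf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])

    // Any further filtering happens client-side in the executors,
    // analogous to filtering in the map phase of a MapReduce job.
    println(s"rows scanned: ${rdd.count()}")

    spark.stop()
  }
}

So the efficiency question mostly comes down to how much of the scan
you can narrow server-side (families, column ranges, timeranges,
filters) before the remaining rows are shipped to Spark.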