Re: Custom filter appears to just hang

2019-09-17 Thread Jean-Marc Spaggiari
Maybe you can try to surround all your methods with giant try/catch statements and log whatever you might get? It might help to catch an exception that is dropped somewhere. On Tue, Sep 17, 2019 at 06:24, Mike Thomsen wrote: > Do you know of any third party filters that are posted somewhere where

Re: Region in RIT (CLOSING) , How to fix it ?

2019-09-02 Thread Jean-Marc Spaggiari
Hi Syni, Have you tried using HBCK2? JMS On Mon, Sep 2, 2019 at 07:57, Syni Guo wrote: > > > HBase version: 2.1.3 > > > There are 2 regions in RIT (CLOSING). How to fix it? I tried to unassign > it, but it timed out. > > > hbase(main):032:0> unassign

Re: HBase 2 ,bulk import question

2019-07-18 Thread Jean-Marc Spaggiari
wrote: > To add to that, the split will be done on the master, so if you > anticipate a lot of splits it can be an issue. > > -Austin > > On 7/18/19 12:32 PM, Jean-Marc Spaggiari wrote: > > One thing to add, when you will bulkload your files, if needed, they wil

Re: HBase 2 ,bulk import question

2019-07-18 Thread Jean-Marc Spaggiari
One thing to add: when you bulkload your files, they will be split, if needed, according to the region boundaries. Because between when you start your job and when you push your files, there might have been some "natural" splits on the table side, the bulkloader has to be able to re-split

Re: Need to apply patch HBASE-8163 on Hbase 1.4.10

2019-07-03 Thread Jean-Marc Spaggiari
I just checked and I can see it in 1.4.10 jmspaggiari@t460s:/stock/hbase-releases/hbase-1.4.10$ grep -R hbase.hregion.memstore.chunkpool.maxsize * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/MemStoreChunkPool.java: final static String CHUNK_POOL_MAXSIZE_KEY =

Re: Re: question on hfile size upper limit

2019-06-20 Thread Jean-Marc Spaggiari
gqiang0...@163.com> wrote: > > > this conf: > > > > hbase.hregion.max.filesize > > 10737418240 > > > > Maximum HStoreFile size. If any one of a column families' HStoreFiles > > has > > grown to exceed this value, the hosting HRegion is split in >

Re: question on hfile size upper limit

2019-06-18 Thread Jean-Marc Spaggiari
Hi, Can you please confirm which parameter you are talking about? The default HBase setting is to limit the size per region (10GB by default), and not by HFiles. This can be configured at the HBase level, or at the table level. HTH, JMS On Tue, Jun 18, 2019 at 11:32, wangyongqiang0...@163.com <

Re: Scan vs TableInputFormat to process data

2019-06-03 Thread Jean-Marc Spaggiari
Also, keep in mind that by bypassing the RegionServer you also bypass the security rules... JMS On Sat, Jun 1, 2019 at 21:43, Josh Elser wrote: > Hi Guillermo, > > Yes, you are missing something. > > TableInputFormat uses the Scan API just like Spark would. > > Bypassing the RegionServer

Re: When I used completebulkload to make the data valid in habse, an exception occurred in the system.

2019-05-20 Thread Jean-Marc Spaggiari
Hi Li Wei, Is your cluster secured? JMS On Mon, May 20, 2019 at 13:18, wrote: > Hello User. > > (HBase version: 2.1.2; Hadoop version: 2.9.1). > HFILE has been generated in the /hadoop/ACG1/output directory through > mapreduce, but when using completebulkload, the abnormal process occurs as >

Re: Spark HBase Partitionner

2019-04-11 Thread Jean-Marc Spaggiari
> Don't know if there's a partitioner already or not. > > On Thu, Apr 11, 2019, 09:14 Jean-Marc Spaggiari > wrote: > > > Hi, > > > > Do we have an HBase Partitioner for Spark? It's pretty straightforward > > but I'm not able to find one. If we don't, I wi

Spark HBase Partitionner

2019-04-11 Thread Jean-Marc Spaggiari
Hi, Do we have an HBase Partitioner for Spark? It's pretty straightforward but I'm not able to find one. If we don't, I will do it and contribute it back, but then where should it be pushed? Thanks, JMS

Re: illegal reflective access

2019-04-10 Thread Jean-Marc Spaggiari
Thanks. Added myself to those cases. On Wed, Apr 10, 2019 at 12:58, Sean Busbey wrote: > also being tracked in HBASE-22172 and HBASE-21110 > > On Wed, Apr 10, 2019 at 8:39 AM Jean-Marc Spaggiari > wrote: > > > > Didn't even realize I was running JDK11 ;) Side effe

Re: illegal reflective access

2019-04-10 Thread Jean-Marc Spaggiari
aring) So for now I will ignore that. Thanks, JMS On Wed, Apr 10, 2019 at 09:30, 张铎(Duo Zhang) wrote: > You're using Java11? HBase has not added the support for Java11 yet... > > Jean-Marc Spaggiari wrote on Wed, Apr 10, 2019 at 9:25 PM: > > > Hi, > > > > Does this nee

illegal reflective access

2019-04-10 Thread Jean-Marc Spaggiari
Hi, Does this need to be reported or should I just ignore it? Thanks, JMS WARNING: An illegal reflective access operation has occurred WARNING: Illegal reflective access by org.apache.hadoop.hbase.util.UnsafeAvailChecker (file:/tmp/p-0.0.1-SNAPSHOT-jar-with-dependencies.jar) to method

Re: Difference of n columns with 1 version vs 1 column with n versions

2019-03-30 Thread Jean-Marc Spaggiari
Hi Serkan, This is my personal opinion and some might not share it ;) I tried to go with the deep versions approach for one project and I found issues with some of the calls (pagination over versions, for example). So if both approaches (deep versions and wide columns) are the same for you, I would say,

Re: Aggregation

2019-03-13 Thread Jean-Marc Spaggiari
On Wed, Mar 13, 2019 at 6:41 AM Jean-Marc Spaggiari < > jean-m...@spaggiari.org> > wrote: > > > Hi, > > > > I have a quick question regarding aggregation. > > > > First, let me explain my understanding. I see two types of aggregation. > > >

Aggregation

2019-03-13 Thread Jean-Marc Spaggiari
Hi, I have a quick question regarding aggregation. First, let me explain my understanding. I see two types of aggregation. First is at the column level. Like, AVG(age) on a table. It will, on the server side, for each region, sum the age, and divide by the number of rows. Fine. Second is at
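
A side note on the column-level case described above: when each region computes its part on the server side, the client cannot simply average the per-region averages, because regions hold different numbers of rows. A minimal sketch in plain Java, assuming each region returns a (sum, row count) pair; the class and method names here are hypothetical and not part of any HBase API:

```java
import java.util.Arrays;
import java.util.List;

public class RegionAverage {

    /** Hypothetical partial result returned by one region: column sum and row count. */
    public static final class Partial {
        final long sum;
        final long count;

        public Partial(long sum, long count) {
            this.sum = sum;
            this.count = count;
        }
    }

    /** Combines the per-region partials into one global average. */
    public static double combine(List<Partial> partials) {
        long sum = 0;
        long count = 0;
        for (Partial p : partials) {
            sum += p.sum;
            count += p.count;
        }
        // No rows at all: there is no meaningful average.
        return count == 0 ? Double.NaN : (double) sum / count;
    }

    public static void main(String[] args) {
        // Region A holds ages 20 and 40 (sum 60, 2 rows); region B holds age 60 (sum 60, 1 row).
        List<Partial> partials = Arrays.asList(new Partial(60, 2), new Partial(60, 1));
        // The global average is 120 / 3; averaging the two regional averages (30 and 60) would give 45.
        System.out.println(combine(partials));
    }
}
```

Carrying the sum and the count separately until the very end is what keeps the result correct when regions are unevenly filled.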

Re: Can MOB be enabled for existing table?

2018-09-20 Thread Jean-Marc Spaggiari
Hi Andrea, I would say that the easiest way to know is to try it... I'm guessing that it should work. When HBase compacts the file, it will figure out that some fields are bigger than the configured MOB threshold and will move them under the MOB umbrella. But this has to be tested. Let us know.

Re: HMerge Status

2018-08-31 Thread Jean-Marc Spaggiari
not > > thousand merges. It's also challenging to find candidate pairs. > > > > -Austin > > > > > > On 08/30/2018 03:45 PM, Jean-Marc Spaggiari wrote: > >> Hi Austin, > >> > >> Which version are you using? Why not just using the she

Re: HMerge Status

2018-08-30 Thread Jean-Marc Spaggiari
g to find candidate pairs. > > -Austin > > > On 08/30/2018 03:45 PM, Jean-Marc Spaggiari wrote: > > Hi Austin, > > > > Which version are you using? Why not just use the shell merge command? > > > > JMS > > > > On Thu, Aug 30, 2018 at 15:41,

Re: HMerge Status

2018-08-30 Thread Jean-Marc Spaggiari
Hi Austin, Which version are you using? Why not just use the shell merge command? JMS On Thu, Aug 30, 2018 at 15:41, Austin Heyne wrote: > We're currently sitting at a very high number of regions due to an > initially poor value for hbase.regionserver.regionSplitLimit and would > like to

Re: Compactions after bulk load

2018-07-17 Thread Jean-Marc Spaggiari
Hi Austin, Can you share your table description? Also, was the table empty? Last, what does your bulk data look like? I mean, how many files? One per region? Are you 100% sure? Have you used the HFile tool to validate the splits and keys of your files? JMS 2018-07-17 14:12 GMT-04:00 Austin Heyne

Re: How to get the HDFS path for a given HBase table?

2018-04-20 Thread Jean-Marc Spaggiari
Hi Ming, Take a look at the FSUtils... There are plenty of very useful helpers there... JMS 2018-04-20 8:38 GMT-04:00 Ming : > Hello, > > > > I am trying to use Java API to get the HDFS path for a given table, but I > cannot find that method. > > For some version, I notice

Re: Integrating Hbase with Solr

2018-03-02 Thread Jean-Marc Spaggiari
Hi Nitin, Have you tried Google? There are many examples online. Here is one: https://www.cloudera.com/documentation/enterprise/5-5-x/topics/search_config_hbase_indexer_for_search.html It's old, but should still be correct. This one is a bit more recent and contains a morphline example:

Re: Delete a CF paramter?

2018-01-01 Thread Jean-Marc Spaggiari
> > HStoreFile storeFile = new HStoreFile(this.getFileSystem(), info, > this. > conf, this.cacheConf, > > this.family.getBloomFilterType(), isPrimaryReplicaStore()); > > > If you want to use a different bloom filter, you can issue this command: > > > alter 'test', { NAME

Delete a CF paramter?

2018-01-01 Thread Jean-Marc Spaggiari
Hi, What is the magic to delete a CF parameter? Like in this example: hbase(main):033:0> desc 'table' Table dns is ENABLED table COLUMN FAMILIES DESCRIPTION {NAME => '@', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS =>

Re: How common is Hbase throttling?

2017-11-29 Thread Jean-Marc Spaggiari
Hi Sumit, I guess you had the chance to find this page? https://blogs.apache.org/hbase/entry/the_hbase_request_throttling_feature Not sure if it answers some of your questions. JMS 2017-11-29 17:18 GMT-05:00 Huaxiang Sun : > Hi Sumit, > >Throttling is not common as

Re: Fast search by any column

2017-08-28 Thread Jean-Marc Spaggiari
Hi Andrzej, Index all your data into SOLR, make it return you the row key for each lookup? JMS 2017-08-28 14:51 GMT-04:00 Andrzej : > How add index to any column? >

Re: Awesome HBase - a curated list

2017-08-02 Thread Jean-Marc Spaggiari
Ha! I missed it! Thanks Anoop! JMS 2017-08-02 9:50 GMT-04:00 Anoop John <anoop.hb...@gmail.com>: > >What's about the HBase to SOLR Lily indexer? Works like a charm! > Already listed under " Secondary Indices" > > -Anoop- > > On Wed, Aug 2, 2017 at 5

Re: Awesome HBase - a curated list

2017-08-02 Thread Jean-Marc Spaggiari
What about the HBase to SOLR Lily indexer? Works like a charm! On Aug 2, 2017 7:50 AM, "Sanel Zukan" wrote: > Few more integrations: > > * Apache Drill (https://drill.apache.org/docs/querying-hbase/) > * Apache Impala (https://impala.apache.org/index.html and >

Re: Setting TTL at the row level

2017-06-21 Thread Jean-Marc Spaggiari
Why not use the cell-level TTL? On 2017-06-21 2:35 PM, "Vladimir Rodionov" wrote: > Should work > > On Wed, Jun 21, 2017 at 11:31 AM, wrote: > > > Hi all, > > > > I know it is possible to set TTL in HBase at the column family level - > > which

Re: Hbase indexer for SOLR

2017-06-08 Thread Jean-Marc Spaggiari
Hi Fred, Just run your own ZK instance and not the embedded one. It's pretty small and easy to start. Make sure SOLR is configured to use /SOLR in ZK and not /. You can try to use "hbase zkcli" to see if your ZK server is running well (and that HBase can talk to it well). JMS 2017-06-08 12:37

Re: hbase input split

2017-05-22 Thread Jean-Marc Spaggiari
Hi Rajesh, Not really. In HBase, data is ordered and stored based on the key. If you want to split by another field, HBase has no clue about the content or where to split nicely. So you will run a mapper on HBase splits, and your logic in a reducer... (Same logic with Spark) JMS 2017-05-22

Re: HBASE and MOB

2017-05-12 Thread Jean-Marc Spaggiari
eproduce this process : > http://blog.cloudera.com/blog/2015/10/how-to-index-scanned-pdfs-at-scale-using-fewer-than-50-lines-of-code/ > > > But maybe is there another solution to reproduce it . > > Fred > > > > From: Jean-Marc S

Re: HBASE and MOB

2017-05-12 Thread Jean-Marc Spaggiari
Hi Fred, Can you please confirm the following information? 1) What exact version of HBase are you using? From a distribution, built by yourself, from the JARs, etc.? 2) Why do you think you need the MOB feature? 3) Is an upgrade an option for you, or not really? Thanks, JMS 2017-05-12 11:02

Re: Scan and Get - different results

2017-03-02 Thread Jean-Marc Spaggiari
Could that not be caused by the filter? What does the scan give you if you remove the filter? 2017-03-02 11:03 GMT-05:00 Ted Yu : > When you issue raw scan, what output do you get ? > > hbase> scan 't1', {RAW => true} > > BTW looks like you have row key 'status', I am bit curious

Re: why Hbase indexer required REPLICATION_SCOPE => '1'

2017-01-17 Thread Jean-Marc Spaggiari
Hi Manjeet, The HBase indexer hooks into the replication layer to carry the mutations over to SOLR. The impact is the same as for replication (like, if you stop SOLR forever, the WALs will stay on the HBase side, etc.). You can look at those links:

Re: High CPU utilization by meta region

2016-11-22 Thread Jean-Marc Spaggiari
To add to what Stack asked, do you have the metrics for your META vs. the other regions? Is the META hot-spotted, which might create an increase in CPU usage? Not just the requests per second, but also the number of calls. Does the META have way more? Or almost the same? Or less? Thanks, JMS

Re: Hbase full scan table data will be fast?

2016-11-18 Thread Jean-Marc Spaggiari
Hi, It will all depend on the number of columns, the performance of your servers, whether or not you are doing the scan in parallel across all the regions at the same time, the processing you will do, etc. So it is not possible to give you any estimate. You will have to test it to figure it out. JMS

Re: [Query :] hbase rebalancing the data after adding new nodes in cluster

2016-10-20 Thread Jean-Marc Spaggiari
Hi Manjeet, Probably because your table is not really deleted. Can you "list" the tables to confirm? For the balancing, just run a major compaction of your table and locality will come back. JMS 2016-10-20 6:58 GMT-04:00 Manjeet Singh : > I have deleted my table but

Re: java.lang.OutOfMemoryError when count hbase table

2016-10-19 Thread Jean-Marc Spaggiari
se-env.xml, now in hbase shell, count > runs well. > > But java client still crashes because : > > Caused by: com.google.protobuf.InvalidProtocolBufferException: Protocol > message was too large. May be malicious. Use > CodedInputStream.setSizeLimit() to increase the size limit. > > I've browsed apache's jira,

Re: java.lang.OutOfMemoryError when count hbase table

2016-10-19 Thread Jean-Marc Spaggiari
Interesting. Can you bump the client heap size? How much do you have for the client? JMS 2016-10-19 3:50 GMT-04:00 big data : > Dear all, > > I've a hbase table, one row has a huge keyvalue, about 100M size. > > When I execute count table in hbase shell, hbase crash to

Re: Maximum size of HBase row

2016-10-17 Thread Jean-Marc Spaggiari
> Thank you JMS ! The scenario of row size exceeding 1 GB is anticipated for > very few rows ( < 0.01 %). > > Hope I can continue with the wide table approach. Do you see a problem ? > > Regards, > Sreeram > > > > On Mon, Oct 17, 2016 at 4:44 PM, Jean-Marc Sp

Re: Maximum size of HBase row

2016-10-17 Thread Jean-Marc Spaggiari
Hi Sreeram, HBase will not split a region within a row. So if a row gets WAY too many columns, its size can grow higher than the configured max region size. Which, of course, is not recommended because your region will serve a single row. If you think your row will become bigger than 1% of your

Re: Loading into hbase from csv file issue

2016-10-03 Thread Jean-Marc Spaggiari
rofile/view?id=AAEWh2gBxianrbJ > d6zP6AcPCCdOABUrV8Pw > <https://www.linkedin.com/profile/view?id=AAEWh2gBxianrb > Jd6zP6AcPCCdOABUrV8Pw>* > > > > http://talebzadehmich.wordpress.com > > > *Disclaimer:* Use it at your own risk. Any and all responsibility

Re: Loading into hbase from csv file issue

2016-10-03 Thread Jean-Marc Spaggiari
Hi Mich, As you said, it's most probably because it's all the same key... If you want to be 200% sure, just alter VERSIONS => '1' to be greater (like, 10) and scan all the versions of the cells. You should see the others. JMS 2016-10-03 3:41 GMT-04:00 Mich Talebzadeh

Re: Posted speakers and talks for the hbaseconeast meetup on September 26th

2016-09-12 Thread Jean-Marc Spaggiari
Works for me ;) Thanks, JMS 2016-09-12 16:15 GMT-04:00 Stack <st...@duboce.net>: > Smile. No. How about hbaseconeast? > > On Mon, Sep 12, 2016 at 10:46 AM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org> wrote: > > > Hi St.Ack, > > > > D

Re: Posted speakers and talks for the hbaseconeast meetup on September 26th

2016-09-12 Thread Jean-Marc Spaggiari
Hi St.Ack, Do we have a hashtag to tweet about the event? JMS 2016-09-12 13:41 GMT-04:00 Stack : > Looking like some nice talks. See http://www.meetup.com/HBase-NYC/ > > Signup if you are out east and haven't done so already, > > St.Ack for Program Committee >

Re: Hbase table size with replicas

2016-08-25 Thread Jean-Marc Spaggiari
Then it is most probably the size without replication. To be 100% sure you can create a 1MB file using this command: dd if=/dev/zero of=output.file bs=1024 count=1024 Then push it into HDFS and see what size it gives you... JMS 2016-08-25 10:24 GMT-04:00 marjana : > Hm I only

Re: Hbase table size with replicas

2016-08-25 Thread Jean-Marc Spaggiari
Hi Marjana, Depending on the HDFS version you have, you should see 2 numbers at the beginning of the line. Something like: 11.7 K 32.5 K /user/myself The first number is without the HDFS replicas. The 2nd number is with the HDFS replicas. HTH, JMS 2016-08-25 9:58 GMT-04:00 marjana

Re: Is setting the start and endrow inclusive of the keys provided

2016-08-18 Thread Jean-Marc Spaggiari
Does this help? /** * Create a Scan operation for the range of rows specified. * @param startRow row to start scanner at or after (inclusive) * @param stopRow row to stop scanner before (exclusive) */ public Scan(byte [] startRow, byte [] stopRow) { 2016-08-18 3:43 GMT-04:00
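
To make the inclusive/exclusive rule from that Javadoc concrete, here is a small JDK-only model of it. This is a sketch and not HBase client code: a sorted `TreeMap` stands in for the table, `String` keys stand in for byte-array row keys, and the class name and row names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class ScanRangeModel {

    /** Returns the row keys in [startRow, stopRow): start inclusive, stop exclusive. */
    public static List<String> scan(TreeMap<String, String> table, String startRow, String stopRow) {
        // subMap(from, fromInclusive, to, toInclusive) mirrors the documented scan bounds.
        return new ArrayList<>(table.subMap(startRow, true, stopRow, false).keySet());
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put("row1", "a");
        table.put("row2", "b");
        table.put("row3", "c");
        table.put("row4", "d");
        // "row2" is returned, "row4" is not.
        System.out.println(scan(table, "row2", "row4"));
    }
}
```

So to include the row named by the stop key, the stop key passed in has to sort just after it.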

Re: Hbase Row key lock

2016-08-17 Thread Jean-Marc Spaggiari
Hi Manjeet, Then have you looked at checkAndPut and the other checkAnd* calls? JMS 2016-08-17 5:08 GMT-04:00 Manjeet Singh : > Hi Dima > > Thanks for your reply its useful for me, My concern is that I have very > frequent get/put opration using spark Hbase connector

Re: If hbase can trigger minor-compaction when it is doing major-compaction?

2016-08-17 Thread Jean-Marc Spaggiari
Hi, There are 2 reasons to have a major compaction. The first one is when a minor compaction selects all the files to be compacted. It is then promoted to a major compaction for that region. The second reason is time based. Every day or week, depending on your configuration, HBase will trigger a

Re: get first row of every region

2016-08-01 Thread Jean-Marc Spaggiari
Well, then it should return the row from the next region. So it might mean that the last region is empty, or the last X regions, no? On 2016-08-01 7:52 PM, "Vladimir Rodionov" wrote: > it means that for some regions you do not have any data. > > -Vlad > > On Mon,

Re: Delete row that has columns with future timestamp

2016-06-26 Thread Jean-Marc Spaggiari
Hi, This is a known issue and I think it is solved in more recent versions. Do you have the option to upgrade? JMS On 2016-06-26 07:00, "M. BagherEsmaeily" wrote: > these problem doesn't solve with major compact!! Assuming the problem is > solved with major compact,

Re: May I run hbase on top of Alluxio/tacyon

2016-06-20 Thread Jean-Marc Spaggiari
I think you might want to clean everything and retry. Clean the ZK /hbase content as well as your fs /hbase folder and restart... 2016-06-20 3:22 GMT-04:00 kevin : > *I got some error:* > > 2016-06-20 14:50:45,453 INFO [main] zookeeper.ZooKeeper: Client >

Re: hbase uniformsplit for non hex keys

2016-05-31 Thread Jean-Marc Spaggiari
Hi Shushant, There are currently only 2 possible values for it: UniformSplit and HexStringSplit. UniformSplit will evenly split the regions using binary keys. Therefore, values might vary from 0x00 to 0xFF. On the other side, HexStringSplit will create regions using hexadecimal string values

Re: Retiring empty regions

2016-04-01 Thread Jean-Marc Spaggiari
;) That was not the question ;) So Nick, merge on 1.1 is not recommended??? It was working very well on previous versions. Does ProcV2 really impact it that badly?? JMS 2016-04-01 13:49 GMT-04:00 Vladimir Rodionov : > >> This is something > >> which makes it far less useful

Re: is it a good idea to disable tables not currently hot?

2016-03-19 Thread Jean-Marc Spaggiari
of information from the > web console. > Might be interesting to look on the log side next time you face that... JMS > > > Also, have you thought about eventually merging some of the tables > together? > > Haven't thought about it. I might go there if no other options. >

Re: why Hbase only split regions in one RegionServer

2016-03-19 Thread Jean-Marc Spaggiari
t of balance. > > We should help Jack find out the cause for imbalance of regions. > > On Wed, Mar 16, 2016 at 4:17 AM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org> wrote: > > > +1 with what Heng said. I think we should just deprecate the ability to > not > >

Re: How to implement increment in an idempotent manner

2016-03-19 Thread Jean-Marc Spaggiari
base table. > > On Fri, Mar 18, 2016 at 3:30 PM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org> wrote: > > > At the beginning of your Storm bolt process can you not do a put of "0"? > So > > it start back from scratch? Or else you will need to query the v

Re: is it a good idea to disable tables not currently hot?

2016-03-19 Thread Jean-Marc Spaggiari
d is resolved. > > > > > > The second issue, which I have battled with for two years now, is > > > that I am doing online puts, which occasionally triggers compacts > > > when a region is heavily inserted, and whenever it happens, all > > > subsequent read/wri

Re: why Hbase only split regions in one RegionServer

2016-03-19 Thread Jean-Marc Spaggiari
it. JMS 2016-03-16 10:54 GMT-04:00 Dave Latham <lat...@davelink.net>: > What if someone doesn't know the distribution of their row keys? > HBase should be able to handle this case. > > On Wed, Mar 16, 2016 at 7:18 AM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org

Re: How to implement increment in an idempotent manner

2016-03-19 Thread Jean-Marc Spaggiari
At the beginning of your Storm bolt process can you not do a put of "0"? So it starts back from scratch? Or else you will need to query the value, and keep the value to put it back if you need to replay your bolt. Another option is, you increment a specific difference column and at the end if you are

Re: is it a good idea to disable tables not currently hot?

2016-03-18 Thread Jean-Marc Spaggiari
> all on hold and I can see time out error on the client side. A typical > compact runs for 4 minutes now and I have to increase timeout on a number > of places to accommodate that. So if I increase the size to 10 GB, will > compact time double? > > -Original Message- >

Re: is it a good idea to disable tables not currently hot?

2016-03-18 Thread Jean-Marc Spaggiari
attached. > > > > *From:* Jean-Marc Spaggiari [mailto:jean-m...@spaggiari.org] > *Sent:* Friday, March 18, 2016 3:33 PM > *To:* user > *Cc:* Frank Luo > > *Subject:* Re: is it a good idea to disable tables not currently hot? > > > > Indeed ;) Frank if you can past

Re: is it a good idea to disable tables not currently hot?

2016-03-18 Thread Jean-Marc Spaggiari
t; > *From:* Frank Luo > *Sent:* Friday, March 18, 2016 5:11 PM > *To:* 'Jean-Marc Spaggiari'; user > *Subject:* RE: is it a good idea to disable tables not currently hot? > > > > Config attached. > > > > *From:* Jean-Marc Spaggiari [mailto:jean-m...@spaggiari.org &

Re: is it a good idea to disable tables not currently hot?

2016-03-18 Thread Jean-Marc Spaggiari
Hi Frank, It might be doable. What HBase version are you running? JMS 2016-03-18 12:25 GMT-04:00 Frank Luo : > No one has experience disabling tables? > > -Original Message- > From: Frank Luo [mailto:j...@merkleinc.com] > Sent: Thursday, March 17, 2016 4:51 PM >

Re: is it a good idea to disable tables not currently hot?

2016-03-18 Thread Jean-Marc Spaggiari
Frank listed. > > On Fri, Mar 18, 2016 at 12:36 PM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org> wrote: > > > By default memstore is 40%. Here it's 24%. There is a lot you might want > to > > look at on your cluster and usecase :( > > > > 1) You

Re: why Hbase only split regions in one RegionServer

2016-03-16 Thread Jean-Marc Spaggiari
+1 with what Heng said. I think we should just deprecate the ability to not pre-split a table ;) It's always good to pre-split it based on your key design... 2016-03-16 0:17 GMT-04:00 Heng Chen : > bq. the table I created by default having only one region > > Why not

Re: How to Intgrate HBase With SparkStreaming

2016-03-10 Thread Jean-Marc Spaggiari
Have you looked and tried this? https://hbase.apache.org/book.html#_spark_streaming It doesn't work for you? JMS 2016-03-10 9:36 GMT-05:00 Rachana Srivastava < rachanasrivas...@yahoo.com.invalid>: > Hello all, > I am trying to integrate HBase with SparkStreaming new APIs mentioned here >

Re: Spark on Hbase

2016-03-09 Thread Jean-Marc Spaggiari
Rachana, is that a dev environment? Do you have a hard constraint to CDH 5.5? If not, can you try to pull CDH 5.7 instead? It's not yet released but some snapshots seem to be available and Spark works pretty well with it. I just built a few examples successfully. HTH. JMS 2016-03-09 14:00

Re: HBase : Transaction queries

2016-02-17 Thread Jean-Marc Spaggiari
Hi Divya, HBase doesn't support transactions with rollback and commit. It has row-level atomicity only. Some frameworks have been built on top of HBase to provide this kind of feature, but with a big impact on performance. You can take a look at (And I might be missing some): - Themis -

Re: Inserting different versions for a KeyValue through 2 HFiles while bulkloading

2016-02-16 Thread Jean-Marc Spaggiari
Hi Mehdi, HBase will sort the KeyValues based on the timestamp. So if you keep just one version, only the last one will be returned. If you keep more than one and ask for all of them, then you will get both, ordered by timestamp. HTH JMS 2016-02-16 5:19 GMT-05:00 Mehdi Ben Haj Abbes
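
The behavior described above can be modeled with nothing but the JDK. This is only a sketch of the ordering rule, not of HBase internals: a `TreeMap` keyed by timestamp stands in for the versions of one cell, the newest timestamp is returned first, and the class name, method name and sample values are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class CellVersionsModel {

    /** Returns up to maxVersions values of one cell, newest timestamp first. */
    public static List<String> read(TreeMap<Long, String> versions, int maxVersions) {
        List<String> out = new ArrayList<>();
        // descendingMap() iterates from the highest timestamp to the lowest.
        for (String value : versions.descendingMap().values()) {
            if (out.size() == maxVersions) {
                break;
            }
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        // Two versions of the same cell, e.g. loaded from two different HFiles.
        TreeMap<Long, String> versions = new TreeMap<>();
        versions.put(100L, "value-from-first-file");
        versions.put(200L, "value-from-second-file");
        System.out.println(read(versions, 1)); // only the latest version
        System.out.println(read(versions, 2)); // both, newest first
    }
}
```

The point is that which file a value came from does not matter; only the timestamp decides which version wins.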

Re: Fuzzy Row filter with ranges

2016-02-05 Thread Jean-Marc Spaggiari
Hi, Be careful with your stop key here, you might miss some values. Consider prefix 0xAA. Start=0xAA Stop=0xAA 0xFF You will miss 0xAA 0xFF 0xFF which is still a valid option. JMS On 2016-02-05 3:21 AM, "Jameson Li" <hovlj...@gmail.com> wrote: > 2016-02-01 19:08
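
One common way around the problem in that example is to build the stop key by incrementing the prefix instead of appending 0xFF bytes. The sketch below is plain Java under that assumption; the class and method names are hypothetical, and the comparison is a hand-written unsigned byte-by-byte compare rather than a call into HBase.

```java
import java.util.Arrays;

public class PrefixStopKey {

    /**
     * Returns the smallest key that sorts after every key starting with prefix.
     * An empty array means "no upper bound" (the prefix was all 0xFF bytes).
     */
    public static byte[] stopKeyForPrefix(byte[] prefix) {
        int i = prefix.length - 1;
        // Trailing 0xFF bytes cannot be incremented, so drop them.
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return new byte[0];
        }
        byte[] stop = Arrays.copyOf(prefix, i + 1);
        stop[i]++;
        return stop;
    }

    /** Unsigned lexicographic comparison of two keys. */
    public static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int k = 0; k < n; k++) {
            int diff = (a[k] & 0xFF) - (b[k] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] row = {(byte) 0xAA, (byte) 0xFF, (byte) 0xFF};
        byte[] naiveStop = {(byte) 0xAA, (byte) 0xFF};
        byte[] safeStop = stopKeyForPrefix(new byte[] {(byte) 0xAA}); // 0xAB

        // A row is inside the scan only if it sorts strictly before the exclusive stop key.
        System.out.println("naive stop includes row: " + (compare(row, naiveStop) < 0));
        System.out.println("incremented stop includes row: " + (compare(row, safeStop) < 0));
    }
}
```

With Stop=0xAA 0xFF the row 0xAA 0xFF 0xFF sorts after the stop key and is skipped; with Stop=0xAB every key starting with 0xAA is covered.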

Re: Fuzzy Row filter with ranges

2016-02-01 Thread Jean-Marc Spaggiari
Can't you just add a start row and a stop row to your scan? Can you provide an example of what you are trying to do? JMS 2016-02-01 0:11 GMT-05:00 hongbin ma : > i suggest you reading the current implementation of fuzzy row filter, and > modify it according to your

Re: Flat-wide table Hbase

2015-12-17 Thread Jean-Marc Spaggiari
umn family name, qualifier name ,value > p.add(Bytes.toBytes("test_family"), > Bytes.toBytes(colno + ColumnNames[index]), > Bytes.toBytes(colvalues[index])); > > table1.put(p); > > Put col = new Put(Bytes.toBytes(colval

Re: Flat-wide table Hbase

2015-12-14 Thread Jean-Marc Spaggiari
on I need to retrieve data from hbase table within one or > two seconds. I have tried as you suggested which may lead to 1000 rows for > a given id which takes more than a minute in retrieval process. > > Thanks > Rajeshkumar > > On Mon, Dec 14, 2015 at 3:29 PM, Jean-Marc Sp

Re: Flat-wide table Hbase

2015-12-14 Thread Jean-Marc Spaggiari
Hi, HBase is a key value store. So what you are pushing here will be stored as: 1002 | xxx | www.sample.com | xx:xx:xx 1003 | yyy | www.url,com | xx:xx:yy 1002 | xxx | url.com | yy:yy:yy 1002 | xxx | urrl2.com | zz:zz:zz HOWEVER HBase will never split a region within a key and keys are

Re: Flat-wide table Hbase

2015-12-14 Thread Jean-Marc Spaggiari
inserted as > rowkey and url.com will be one of my column qualifier > > what happens when I try to insert next row i.e., 1002 | xxx | urrl2.com | > zz:zz:zz for this also row-key will be 1002-xxx. As far as I know when we > try to insert same row-key the row will be updated.

Re: Flat-wide table Hbase

2015-12-14 Thread Jean-Marc Spaggiari
ur reply inserting second row will not update the existing > row-key and it will add as new column qualifiers to the existing row-key > > Thanks > > On Mon, Dec 14, 2015 at 4:13 PM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org> wrote: > > > Hi, > > &

Re: Flat-wide table Hbase

2015-12-14 Thread Jean-Marc Spaggiari
s is what I need and I am considering this as flat-wide table > approach. > >I have some doubts and first of them is how to create dynamic column > qualifiers. Do you know the command or any other sites which is useful for > this approach. > > Thanks > > On Mon, Dec

Re: unsubscribe me please from this mailing list

2015-12-04 Thread Jean-Marc Spaggiari
Hi all, The same way you subscribed on your own, you will have to unsubscribe on your own too... https://hbase.apache.org/mail-lists.html JMS 2015-12-04 5:56 GMT-05:00 Dinu Sweet : > Thanks for the information provided so far. > Please unsubscribe me too from this

Re: Am I crazy or that's not that much?

2015-12-01 Thread Jean-Marc Spaggiari
I cannot say if you are crazy or not. Only you know ;) Now, regarding the number of columns... it depends... If you want to store 800 000 1MB columns, it's almost 800GB for one region. Forget that! HBase will not split within a row. So you will kill your RS with a region that big. But if you want

Re: timestamp/ttl of a cell

2015-11-25 Thread Jean-Marc Spaggiari
This? HBASE-10560 2015-11-25 6:45 GMT-05:00 Shushant Arora : > Hi > > Can TTL of rows be set/updated instead of complete column family? > or > Can timestamp version of a cell be decreased ? Aim is to delete some rows > whose timestamp > is set to old values so that it

Re: timestamp/ttl of a cell

2015-11-25 Thread Jean-Marc Spaggiari
ell and java ? > > On Wed, Nov 25, 2015 at 6:05 PM, Jean-Marc Spaggiari < > jean-m...@spaggiari.org> wrote: > > > This? HBASE-10560 > > > > 2015-11-25 6:45 GMT-05:00 Shushant Arora <shushantaror...@gmail.com>: > > > > > Hi > >

Re: hbase timerange scan

2015-11-05 Thread Jean-Marc Spaggiari
Hi Shushant, If my memory serves me correctly, someone asked the same question about a year ago, and one of the committers looked into the code and figured out that there is no mechanism to skip some of the files even when a timestamp is provided. It ended up being a full table scan. You might be able

Re: hbase timerange scan

2015-11-05 Thread Jean-Marc Spaggiari
les if the time range for a scan does not intersect with the time > range of data in the store file. However, without tiered compaction there > is nothing built in to optimize grouping of data into store files by time > range for efficiency in time range scans. > > On Thu, Nov 5, 2015 a

Re: Slow reads coinciding with higher compaction time avg time

2015-11-02 Thread Jean-Marc Spaggiari
How much memory do you have on this server, what is running and how much did you give to what? Also, what is your swappiness level? http://askubuntu.com/questions/103915/how-do-i-configure-swappiness 2015-11-02 19:20 GMT-05:00 Girish Joshi : > Thanks. Do you have any

Re: Does adding new columns cause compaction storm?

2015-10-13 Thread Jean-Marc Spaggiari
It depends ;) If the added column triggers a flush, this flush might trigger a compaction ;) But it will be the exact same thing with an existing column. It's not because it's a new column that it will trigger a compaction. Any mutation command might trigger a flush then a compaction. Whatever

Re: alter column family - possible operational impacts on big tables

2015-10-09 Thread Jean-Marc Spaggiari
dswizz.com> wrote: > > > > Hi, > > > > Indeed, we have tables with 1-5000 regions, distributed on 10-15 RSs. > > > > A few hours are sufficient to do the alter one a single such table, > right? > > > > Thanks, > > Nicu > > > >

Re: alter column family - possible operational impacts on big tables

2015-10-08 Thread Jean-Marc Spaggiari
Hi Nicu, Indeed, with 0.94 you have to disable the table before doing the alter. However, for 30 regions, it should be pretty fast. When you say 30+, are you talking about like 1K regions? Or more like 32? The alter will only update the meta table, so not that much impact on the servers. And no

Re: Hbase import/export change number of rows

2015-09-22 Thread Jean-Marc Spaggiari
Very interesting. Are you able to figure out which rows are missing? What version of HBase are you using? How big is your table? What do the Export and Import tools report? Is the ingestion stopped while doing the export/import sequence? Can you reproduce that every time? Thanks, JM 2015-09-22

Re: How to detect whether hbase cluster is up and ready for accepting requests.

2015-09-02 Thread Jean-Marc Spaggiari
I guess you can also use something like this: public class TestInstallation { private static final Log LOG = LogFactory.getLog(TestInstallation.class); public static void main(String[] args) { Configuration conf = HBaseConfiguration.create(); try { LOG.info("Testing HBase

Re: Access cell tags from HBase shell

2015-08-31 Thread Jean-Marc Spaggiari
But I don't think you can retrieve the list of labels for a given cell, right? Cells are only interpreted server side and are not returned on the client side... 2015-08-31 15:52 GMT-04:00 Ted Yu : > From the help message of put command, you can see the following: > >

Re: hbase pre-split

2015-08-27 Thread Jean-Marc Spaggiari
Jackie, tell us more... What do you want to know? How many ways you have to split the table? Like from the shell, from the web UI, from the Java API, etc.? Or how many ways to decide how many splits you want to have at the beginning, at the end, etc.? JM 2015-08-26 21:42 GMT-04:00 Ted Yu

Re: optimal size for Hbase.hregion.memstore.flush.size and its impact

2015-08-24 Thread Jean-Marc Spaggiari
The split policy also uses the flush size to estimate how to split tables... It's sometimes fine to increase this number a bit. Like, to 256MB. But 512MB is pretty high, and 800MB is even more. Big memstores take more time to flush and can block the writes if they are not fast enough. If

Re: Replicating MySQL table to HBase

2015-08-14 Thread Jean-Marc Spaggiari
Hi, Before even going in that direction, why do you want to do that? It's most probably not a good idea. Is it for backup? For replication? etc. JM 2015-08-14 19:56 GMT-04:00 Buntu Dev buntu...@gmail.com: I'm looking for ways to setup an incremental update task to replicate the MySQL

Re: Replicating MySQL table to HBase

2015-08-14 Thread Jean-Marc Spaggiari
analysis on those which requires some meta data only available in MySQL. We could do a one time Sqoop and then want to setup a job to capture the changes and write to HBase. I'm looking for options to handle the MySQL changes, thanks! On Fri, Aug 14, 2015 at 5:00 PM, Jean-Marc Spaggiari jean-m

Re: Flushs and compactions

2015-08-06 Thread Jean-Marc Spaggiari
this and I let you know JMS. esteban. -- Cloudera, Inc. On Thu, Aug 6, 2015 at 7:35 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org wrote: Nothing on the logs... http://pastebin.com/4U6Fmxt9 Running standalone. Kept the server running for 1h after I tried
