>>> Is Accumulo able to import these files, considering that they are two 
>>> different locality groups 

Yes. 

>>> without triggering a huge major compaction? 

Depends on your table.compaction.major.ratio and table.file.max settings. 


Sorry, not a real answer, but I think the answer is "it depends". 
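
To make the "it depends" a bit more concrete, here is a rough Python sketch of how a ratio-based major compaction check behaves. It is a simplified model, not Accumulo's actual code: it assumes the default ratio-based file selection, where a set of files is compacted once the sum of their sizes reaches table.compaction.major.ratio times the size of the largest file in the set. The function name and the file sizes are made up for illustration, and the sketch ignores table.file.max entirely.

```python
def files_to_compact(sizes, ratio=3.0):
    """Return the file sizes a ratio-based check would select, or [] if none.

    Simplified model (an assumption, not Accumulo source code):
    a candidate set is selected when largest * ratio <= total size of the set.
    If the test fails, the largest file is dropped and the rest is re-checked.
    """
    candidates = sorted(sizes, reverse=True)  # largest file first
    total = sum(candidates)
    while len(candidates) > 1:
        largest = candidates[0]
        if largest * ratio <= total:
            return candidates
        # The largest file dominates the set: set it aside and retry.
        total -= largest
        candidates = candidates[1:]
    return []


# Hypothetical tablet: one file from locality group A, one bulk-imported for B.
print(files_to_compact([1000, 900]))            # nothing selected
# Several similarly sized files piling up in the same tablet.
print(files_to_compact([1000, 900, 800, 700]))  # all four selected
```

In this model, importing one file for group B next to one similarly sized file for group A does not trigger anything at a ratio of 3, but repeated imports of similarly sized files eventually do. Raising the ratio delays that point at the cost of keeping more files per tablet.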

----- Original Message -----

From: "Mario Pastorelli" <[email protected]> 
To: [email protected] 
Sent: Friday, October 28, 2016 9:37:13 AM 
Subject: Bulk ingestion of different locality groups at different times 

Hi, 

I have a question about using bulk ingestion for a rather special case. Let's 
say that I have the locality groups A and B. The values of each locality group 
are written to Accumulo at different times, which means that first we ingest 
all the cells of the group A and then of B. We use Spark to ingest those 
records. Right now we write all the values with a custom writer but we would 
like to create the rfiles directly with Spark. In the case above, we would have 
two jobs creating the rfiles for the two distinct locality groups. Is Accumulo 
able to import these files, considering that they are two different locality 
groups, without triggering a huge major compaction? If not, what strategy would 
you suggest for the above use case? 

Thanks, 
Mario 

-- 
Mario Pastorelli | TERA LYTICS 


software engineer 

Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland 
phone: +41794381682 
email: [email protected] 
www.teralytics.net 


Company registration number: CH-020.3.037.709-7 | Trade register Canton Zurich 
Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann de 
Vries 

This e-mail message contains confidential information which is for the sole 
attention and use of the intended recipient. Please notify us at once if you 
think that it may not be intended for you and delete it immediately. 
