Re: Spark on Kudu

2016-05-18 Thread Chris George
://gerrit.cloudera.org:8080/#/c/2992/5/docs/developing.adoc -Chris George On 5/18/16, 9:45 AM, "Benjamin Kim" <bbuil...@gmail.com> wrote: Can someone tell me what the state of this Spark work is? Also, does anyone have any sample code on how to update/insert data in Kudu

Re: Sparse Data

2016-05-12 Thread Chris George
I've used Kudu with an EAV model for sparse data, and that worked extremely well for us with billions of rows and the correct partitioning. -Chris On 5/12/16, 3:21 PM, "Dan Burkert" <d...@cloudera.com> wrote: Hi Ben, Kudu doesn't support sparse datasets with many columns very well. Kudu
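
As a rough illustration of the EAV (entity-attribute-value) layout mentioned in this message, the sketch below pivots sparse, wide records into one narrow row per populated attribute. It is plain Python over in-memory data, not Kudu client code; the function names `to_eav` and `from_eav` and the sample records are hypothetical, and the message gives no detail on the actual schema or partitioning used.

```python
from collections import defaultdict


def to_eav(records):
    """Flatten {entity_id: {attribute: value}} into (entity, attribute, value) rows.

    Attributes whose value is None are skipped, so a sparse record yields
    only as many rows as it has populated attributes.
    """
    rows = []
    for entity_id, attrs in records.items():
        for attribute, value in attrs.items():
            if value is not None:
                rows.append((entity_id, attribute, value))
    return rows


def from_eav(rows):
    """Rebuild the per-entity attribute dictionaries from EAV rows."""
    records = defaultdict(dict)
    for entity_id, attribute, value in rows:
        records[entity_id][attribute] = value
    return dict(records)


# Hypothetical sparse input: each entity populates only a few attributes.
sparse = {
    "device-1": {"temp": 21.5, "humidity": None},
    "device-2": {"humidity": 40, "firmware": "1.2"},
}
eav_rows = to_eav(sparse)
```

In a table built this way, the (entity, attribute) pair would be a natural candidate for the composite key, though that is an assumption rather than something stated in the thread.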

Re: best practices to remove/retire data

2016-05-12 Thread Chris George
How hard would a predicate-based delete be? I.e., ScanDelete or something. -Chris George On 5/12/16, 9:24 AM, "Jean-Daniel Cryans" <jdcry...@apache.org> wrote: Hi, Right now this use case is more difficult than it needs to be. In your previous thread, "Partition and S

Re: why boolean type mapping is missing in Spark datasource

2016-04-25 Thread Chris George
Probably neglected. I'll add it in. On Mon, Apr 25, 2016 at 7:30 PM, Darren Hoo <darren@gmail.com> wrote: I was looking at Chris George's improved Spark DataSource implemen

Re: Spark on Kudu

2016-04-13 Thread Chris George
f it were to be implemented. MERGE INTO table_name USING table_reference ON (condition) WHEN MATCHED THEN UPDATE SET column1 = value1 [, column2 = value2 ...] WHEN NOT MATCHED THEN INSERT (column1 [, column2 ...]) VALUES (value1 [, value2 ...]) Cheers, Ben On Apr 11, 2016, at 12:21
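
The semantics of the MERGE INTO statement quoted above can be sketched in a few lines of Python. This is only an in-memory model of "update when matched, insert when not matched"; it is not Kudu or Spark API code, and the function name `merge_into` and the sample rows are hypothetical.

```python
def merge_into(target, source, key):
    """Apply MERGE-style semantics to an in-memory table.

    target: dict mapping key value -> row dict (modified in place)
    source: iterable of row dicts
    key:    name of the column compared in the ON condition
    Returns a tuple (updated_count, inserted_count).
    """
    updated = inserted = 0
    for row in source:
        k = row[key]
        if k in target:
            # WHEN MATCHED THEN UPDATE SET column1 = value1, ...
            target[k].update(row)
            updated += 1
        else:
            # WHEN NOT MATCHED THEN INSERT (...) VALUES (...)
            target[k] = dict(row)
            inserted += 1
    return updated, inserted


# Hypothetical data: one existing row, one matching and one new source row.
table = {1: {"id": 1, "name": "a"}}
counts = merge_into(
    table,
    [{"id": 1, "name": "b"}, {"id": 2, "name": "c"}],
    key="id",
)
```

Here `counts` reports one update and one insert, and `table` ends up holding both rows with the matched row overwritten by the source values.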

Re: Spark on Kudu

2016-04-11 Thread Chris George
functionality. -Chris George On 4/11/16, 12:22 PM, "Jean-Daniel Cryans" <jdcry...@apache.org> wrote: You guys make a convincing point, although on the upsert side we'll need more support from the servers. Right now all you can do is an INSERT then, if you get a dup