Re: Spark Multiple Hive Metastore Catalog Support

2023-04-17 Thread Elliot West
Hi Ankit, While not a part of Spark, there is a project called 'WaggleDance' that can federate multiple Hive metastores so that they are accessible via a single URI (https://github.com/ExpediaGroup/waggle-dance). This may be useful or perhaps serve as inspiration. Thanks, Elliot. On Mon, 17 Apr

Re: unit testing in spark

2017-04-11 Thread Elliot West
Jörn, I'm interested in your point on coverage. Coverage has been a useful tool for highlighting areas in the codebase that pose a source of potential risk. However, generally speaking, I've found that traditional coverage tools do not provide useful information when applied to distributed data

Re: Spark support for update/delete operations on Hive ORC transactional tables

2016-06-02 Thread Elliot West
Related to this, there is an API in Hive that simplifies the integration of other frameworks with Hive's ACID feature. See: https://cwiki.apache.org/confluence/display/Hive/HCatalog+Streaming+Mutation+API It contains code for maintaining heartbeats, handling locks and transactions, and

Re: Spark integration with HCatalog (specifically regarding partitions)

2016-01-28 Thread Elliot West
, Elliot West <tea...@gmail.com> wrote: > Thanks for your response Jorge and apologies for my delay in replying. I > took your advice with case 5 and declared the column names explicitly > instead of the wildcard. This did the trick and I can now add partitions to > an existing t

Re: Spark integration with HCatalog (specifically regarding partitions)

2016-01-25 Thread Elliot West
"xxx", 1), ("yyy", 2); > > > > hive (default)> insert into table new_record_source > > values (3, "zzz"); > > > > Regards > > > On 11/01/2016, at 13:36, Elliot West <tea...@gmail.com> wrote: > > He

Re: Spark SQL -Hive transactions support

2016-01-19 Thread Elliot West
Hive's ACID feature (which introduces transactions) is not required for inserts, only for updates and deletes. Inserts should be supported on a vanilla Hive shell. I'm not sure how Spark interacts with Hive in that regard, but perhaps the HiveContext implementation is lacking support. On a separate

Spark integration with HCatalog (specifically regarding partitions)

2016-01-11 Thread Elliot West
Hello, I am in the process of evaluating Spark (1.5.2) for a wide range of use cases. In particular I'm keen to understand the depth of the integration with HCatalog (aka the Hive Metastore). I am very encouraged when browsing the source contained within the org.apache.spark.sql.hive package. My