Question about replacing files and about Publishing Jars

2019-02-25 Thread Arvind Pruthi
Hello there, Q1. What happens in case a file is deleted and a new file is to be added with the same name, but the snapshot in which the delete was registered is still around? There is no ambiguity from the point of view of listing the manifest entries. However, there will be ambiguity at the HDFS leve

Re: Iceberg fails to return results when filtered on complex columns ..

2019-02-25 Thread Ryan Blue
Just to follow up on this for anyone watching the dev list. I merged Gautam's fix for this. Thanks for finding and working on this! On Thu, Feb 21, 2019 at 4:09 AM Gautam wrote: > > Hey Ryan, > > I found the root cause of the post scan filter not working over Iceberg > format. > > *The short exp

Re: Developing a "dataset" API / framework for Arrow C++ users

2019-02-25 Thread Wes McKinney
hi Joel and Uwe, yes, feedback from the Iceberg community would be useful about what kinds of APIs are required to be able to interact well with table formats like Iceberg. As Uwe says, the objective of the C++ code I am proposing to develop is to have appropriate C++ APIs for interacting with dif

Re: Developing a "dataset" API / framework for Arrow C++ users

2019-02-25 Thread Uwe L. Korn
Hello, this should definitely be shared with the Apache Iceberg community (cc'ed). The title of the document may be a bit confusing. What is proposed in there is actually constructing the building blocks in C++ that are required for supporting Python/C++/.. implementations for things like Icebe

Re: Would we consider adding support for metrics collection/tracing instrumentation such as opencensus or opentracing?

2019-02-25 Thread filip
+1 on the distributed tracing, no obvious integration points. Dropwizard metrics should suffice wrt functional requirements; after all, it does work for Spark [1], right? Wrt your ask on choosing an established and reasonable dependency set, I think Dropwizard is the only option,