HiveContext.refreshTable() missing in spark 2.0

2016-05-13 Thread 汪洋
Hi all, I noticed that HiveContext used to have a refreshTable() method, but it no longer does in branch-2.0. Was it dropped intentionally? If so, how do we achieve similar functionality? Thanks. Yang

Re: [discuss] separate API annotation into two components: InterfaceAudience & InterfaceStability

2016-05-13 Thread Steve Loughran
> On 12 May 2016, at 22:29, Reynold Xin wrote:
>
> We currently have three levels of interface annotation:
>
> - unannotated: stable public API
> - DeveloperApi: A lower-level, unstable API intended for developers.
> - Experimental: An experimental user-facing API.
>
> After using this anno

Re: [discuss] separate API annotation into two components: InterfaceAudience & InterfaceStability

2016-05-13 Thread Tom Graves
So we definitely need to be careful here. I know you didn't mention it, but it was mentioned by others, so I would not recommend using LimitedPrivate. I had started a discussion on Hadoop about some of this due to the way Spark needed to use some of the APIs. https://issues.apache.org/jira/browse/HA

Re: [discuss] separate API annotation into two components: InterfaceAudience & InterfaceStability

2016-05-13 Thread Sean Busbey
On Fri, May 13, 2016 at 6:37 AM, Tom Graves wrote:
> So we definitely need to be careful here. I know you didn't mention it, but
> it was mentioned by others, so I would not recommend using LimitedPrivate. I had
> started a discussion on Hadoop about some of this due to the way Spark
> needed to use s

Re: [discuss] separate API annotation into two components: InterfaceAudience & InterfaceStability

2016-05-13 Thread Marcelo Vanzin
On Fri, May 13, 2016 at 10:18 AM, Sean Busbey wrote:
> I think LimitedPrivate gets a bad rap due to the way it is misused in
> Hadoop. The use case here -- "we offer this to developers of
> intermediate layers; those willing to update their software as we
> update ours"
I think "LimitedPrivate" i

Re: Shrinking the DataFrame lineage

2016-05-13 Thread Joseph Bradley
Here's a JIRA for it: https://issues.apache.org/jira/browse/SPARK-13346 I don't have a great method currently, but hacks can get around it: convert the DataFrame to an RDD and back to truncate the query plan lineage. Joseph On Wed, May 11, 2016 at 12:46 PM, Ulanov, Alexander < alexander.ula...@h
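The effect of the round trip described above can be illustrated with a toy model. This is a minimal sketch, not Spark code: `Plan`, `opaqueLeaf`, and `iterate` are hypothetical names invented for illustration. It assumes that rebuilding a DataFrame from its RDD hides the earlier query plan behind a single leaf node, so the planner sees a short plan even though the underlying computation is still chained.

```java
import java.util.function.IntUnaryOperator;

class PlanLineageSketch {
    /** Toy query plan: what a planner would have to analyze, plus how to evaluate it. */
    interface Plan {
        int visibleDepth();   // number of plan nodes the planner can see
        int eval(int input);  // apply the whole computation to one value
    }

    /** A leaf that passes its input through unchanged. */
    static Plan source() {
        return new Plan() {
            public int visibleDepth() { return 1; }
            public int eval(int input) { return input; }
        };
    }

    /** One transformation stacked on top of a child plan. */
    static Plan map(Plan child, IntUnaryOperator f) {
        return new Plan() {
            public int visibleDepth() { return child.visibleDepth() + 1; }
            public int eval(int input) { return f.applyAsInt(child.eval(input)); }
        };
    }

    /** Stand-in for the DataFrame -> RDD -> DataFrame hack: same result, but one visible node. */
    static Plan opaqueLeaf(Plan hidden) {
        return new Plan() {
            public int visibleDepth() { return 1; }
            public int eval(int input) { return hidden.eval(input); }
        };
    }

    /** Iterative workload: add 1 per step, optionally truncating every N steps (0 = never). */
    static Plan iterate(int steps, int truncateEvery) {
        Plan p = source();
        for (int i = 1; i <= steps; i++) {
            p = map(p, x -> x + 1);
            if (truncateEvery > 0 && i % truncateEvery == 0) {
                p = opaqueLeaf(p);
            }
        }
        return p;
    }

    public static void main(String[] args) {
        Plan grown = iterate(100, 0);
        Plan truncated = iterate(100, 10);
        System.out.println("visible depth without truncation: " + grown.visibleDepth());
        System.out.println("visible depth with truncation:    " + truncated.visibleDepth());
        System.out.println("results agree: " + (grown.eval(0) == truncated.eval(0)));
    }
}
```

In this model the visible plan stays bounded by the truncation interval, while `eval` still walks every step. That matches the caveat implied by calling the approach a hack: it shortens what the planner sees, not the work behind it.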

Re: [discuss] separate API annotation into two components: InterfaceAudience & InterfaceStability

2016-05-13 Thread Michael Armbrust
+1 to the general structure of Reynold's proposal. I've found what we do currently a little confusing. In particular, it doesn't make much sense that @DeveloperApi things are always labeled as possibly changing. For example the Data Source API should arguably be one of the most stable interfaces
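The idea of separating audience from stability can be sketched in plain Java. This is only an illustration under stated assumptions: the annotation and enum names below are hypothetical and are neither Spark's nor Hadoop's actual definitions. The point is that a developer-facing interface can be marked stable, which a single combined label cannot express.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class ApiAnnotationSketch {
    // Hypothetical axes: who the API is for, and how likely it is to change.
    enum Audience { USER, DEVELOPER }
    enum Stability { STABLE, EVOLVING, UNSTABLE }

    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface InterfaceAudience { Audience value(); }

    @Documented
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface InterfaceStability { Stability value(); }

    // A lower-level API aimed at developers that is nevertheless declared stable.
    @InterfaceAudience(Audience.DEVELOPER)
    @InterfaceStability(Stability.STABLE)
    interface ExampleDataSource { }

    // A user-facing API that may still change.
    @InterfaceAudience(Audience.USER)
    @InterfaceStability(Stability.UNSTABLE)
    interface ExampleExperimentalFeature { }

    /** Reads both annotations reflectively and renders them as one line. */
    static String describe(Class<?> api) {
        InterfaceAudience audience = api.getAnnotation(InterfaceAudience.class);
        InterfaceStability stability = api.getAnnotation(InterfaceStability.class);
        return api.getSimpleName()
                + ": audience=" + audience.value()
                + ", stability=" + stability.value();
    }

    public static void main(String[] args) {
        System.out.println(describe(ExampleDataSource.class));
        System.out.println(describe(ExampleExperimentalFeature.class));
    }
}
```

Keeping the two axes as independent annotations means any audience can be paired with any stability level, at the cost of requiring two annotations on each API.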

RE: Shrinking the DataFrame lineage

2016-05-13 Thread Ulanov, Alexander
Hi Joseph, Thank you for the link! Two follow-up questions: 1) Suppose I have the original DataFrame in Tungsten, i.e. catalyst types and cached in off-heap store. It might be quite useful for iterative workloads due to lower GC overhead. Then I convert it to an RDD and then back to a DF. Will the resul