Re: [DISCUSS] USING syntax for Datasource V2

2018-08-21 Thread Russell Spitzer
So the external catalogue is really more of a connector to different sources of truth for table listings? That makes more sense. On Tue, Aug 21, 2018 at 6:16 PM Ryan Blue wrote: > I don’t understand why a Cassandra Catalogue wouldn’t be able to store > metadata references for a parquet table just

Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-21 Thread Saisai Shao
One issue I can think of is that "moving the driver log" at application end is quite time-consuming, which will significantly delay the shutdown. We already suffer from this "rename" problem for event logs on object stores, and moving the driver log will make the problem more severe. For a vanilla S

Re: [discuss][minor] impending python 3.x jenkins upgrade... 3.5.x? 3.6.x?

2018-08-21 Thread shane knapp
i'll start my testing w/3.6 tomorrow. On Mon, Aug 20, 2018 at 1:30 PM, Li Jin wrote: > Thanks for looking into this Shane. If we can only have a single python 3 > version, I agree 3.6 would be better than 3.5. Otherwise, ideally I think > it would be nice to test all supported 3.x versions (late

Re: [DISCUSS] USING syntax for Datasource V2

2018-08-21 Thread Ryan Blue
I don’t understand why a Cassandra Catalogue wouldn’t be able to store metadata references for a parquet table just as a Hive Catalogue can store references to a C* datasource. Sorry for the confusion. I’m not talking about a catalog that stores its information in Cassandra. I’m talking about a c

Re: Spark DataFrame UNPIVOT feature

2018-08-21 Thread Reynold Xin
Probably just because it is not used that often and nobody has submitted a patch for it. I've used pivot probably on average once a week (primarily in spreadsheets), but I've never used unpivot ... On Tue, Aug 21, 2018 at 3:06 PM Ivan Gozali wrote: > Hi there, > > I was looking into why the UNP

Spark DataFrame UNPIVOT feature

2018-08-21 Thread Ivan Gozali
Hi there, I was looking into why the UNPIVOT feature isn't implemented, given that Spark already has PIVOT implemented natively in the DataFrame/Dataset API. Came across this JIRA which talks about implementing PIVOT in Spark 1.6, but no mention

Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-21 Thread Ankur Gupta
Hi all, I want to highlight a problem that we face here at Cloudera and start a discussion on how to go about solving it. *Problem Statement:* Our customers reach out to us when they face problems in their Spark Applications. Those problems can be related to Spark, environment issues, their own c