Re: [DISCUSS] SPIP: Standardize SQL logical plans

2018-07-12 Thread Ryan Blue
Thanks! I'm all for calling a vote on the SPIP. If I understand the process correctly, the intent is for a "shepherd" to do it. I'm happy to call a vote, or feel free if you'd like to play that role. Other comments: * DeleteData API: I completely agree that we need to have a proposal for it. I

Re: Spark model serving

2018-07-12 Thread Saikat Kanjilal
Thanks so much for responding, Maximiliano. I didn't want this discussion to disappear in the wilderness of dev emails :). Here's what I would like to see or contribute to for model serving within Spark. First off, I want to be clear on what we mean by model serving, so I'll add my interpretation

Re: Revisiting Online serving of Spark models?

2018-07-12 Thread Maximiliano Felice
Hi! To keep things ordered, I just sent an update to an older email that requested an update on this, titled: Spark model serving. I propose to follow the discussion there. Or here, but not to branch. Bye! On Tue, Jul 3, 2018 at 22:15, Matei Zaharia wrote: > Just wondering, is there an

Re: Spark model serving

2018-07-12 Thread Maximiliano Felice
Hi, As I know many of you don't read / are not part of the user list, I'll make a summary of what happened at the summit: we discussed some needs we have in order to start serving our predictions with Spark. We mostly talked about alternatives to this work and what we could expect in these areas.

Re: Creating JDBC source table schema(DDL) dynamically

2018-07-12 Thread Kadam, Gangadhar (GE Aviation, Non-GE)
Ok. Thanks. On 7/12/18, 11:12 AM, "Thakrar, Jayesh" wrote: Unless the tables are very small (< 1000 rows), the impact of hitting the catalog tables is negligible. Furthermore, the catalog tables (or views) are usually in memory because they are needed for query compilation,

Re: Creating JDBC source table schema(DDL) dynamically

2018-07-12 Thread Thakrar, Jayesh
Unless the tables are very small (< 1000 rows), the impact of hitting the catalog tables is negligible. Furthermore, the catalog tables (or views) are usually in memory because they are needed for query compilation, query execution (for triggers, referential integrity, etc.) and even to

Re: Creating JDBC source table schema(DDL) dynamically

2018-07-12 Thread Kadam, Gangadhar (GE Aviation, Non-GE)
Thanks, Jayesh. I was aware of the catalog table approach, but I was avoiding it because I would hit the database twice for each table: once to create the DDL and again to read the data. I have lots of tables to transport from one environment to another, and I don’t want to create unnecessary load on

Re: Creating JDBC source table schema(DDL) dynamically

2018-07-12 Thread Thakrar, Jayesh
One option is to use plain JDBC to interrogate the PostgreSQL catalog for the source table and generate the DDL to create the destination table. Then, using plain JDBC again, create the table at the destination. See the link below for some pointers…
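
The catalog-driven approach described above can be sketched roughly as follows. This is only an illustration, written in Python for brevity, and not the code from the thread: the query reads the standard `information_schema.columns` view, and the helper `build_create_table` and the table name in the usage note are hypothetical. It handles only column names, types, character lengths, and nullability; numeric precision, defaults, keys, and indexes are left out.

```python
# Query to run against the source database. information_schema.columns is a
# standard view that PostgreSQL provides. The "?" placeholders follow JDBC
# PreparedStatement style; adjust them for whichever driver you use.
COLUMNS_QUERY = """
SELECT column_name, data_type, character_maximum_length, is_nullable
FROM information_schema.columns
WHERE table_schema = ? AND table_name = ?
ORDER BY ordinal_position
"""


def build_create_table(table, columns):
    """Build a CREATE TABLE statement from catalog rows.

    `columns` is an iterable of (name, data_type, max_length, is_nullable)
    tuples in the shape returned by COLUMNS_QUERY. This helper is a
    hypothetical illustration, not part of Spark or any JDBC driver.
    """
    defs = []
    for name, data_type, max_len, nullable in columns:
        # Append the length only for types that report one, e.g. varchar.
        col_type = data_type if max_len is None else f"{data_type}({max_len})"
        # information_schema reports nullability as the strings 'YES' / 'NO'.
        null_clause = "" if nullable == "YES" else " NOT NULL"
        defs.append(f'    "{name}" {col_type}{null_clause}')
    return f"CREATE TABLE {table} (\n" + ",\n".join(defs) + "\n)"
```

Given the rows fetched for a source table, `build_create_table("public.customers", rows)` returns a DDL string that can then be executed on the destination connection before the data is copied.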

Re: Asking for reviewing PRs regarding structured streaming

2018-07-12 Thread Jungtaek Lim
I recently added more test results to SPARK-24763 [1], which show that the proposal reduces state size according to the ratio of key-value size, while there's no performance hit and sometimes even a slight boost. Please refer to the latest comment in the JIRA issue [2] to see the numbers from perf.