Re: Spark Dataframe and HIVE

2018-02-10 Thread रविशंकर नायर
Hi, Here you go:

hive> show create table mine;
OK
CREATE TABLE `mine`(
  `policyid` int,
  `statecode` string,
  `socialid` string,
  `county` string,
  `eq_site_limit` decimal(10,2),
  `hu_site_limit` decimal(10,2),
  `fl_site_limit` decimal(10,2),
  `fr_site_limit` decimal(10,2),
  `tiv_2014`

Re: Spark Dataframe and HIVE

2018-02-10 Thread Shmuel Blitz
Please run the following command, and paste the result: SHOW CREATE TABLE <> On Sun, Feb 11, 2018 at 7:56 AM, ☼ R Nair (रविशंकर नायर) < ravishankar.n...@gmail.com> wrote: > No, No luck. > > Thanks > > On Sun, Feb 11, 2018 at 12:48 AM, Deepak Sharma > wrote: > >> In hive

Re: Spark Dataframe and HIVE

2018-02-10 Thread रविशंकर नायर
No, no luck. Thanks On Sun, Feb 11, 2018 at 12:48 AM, Deepak Sharma wrote: > In hive cli: > msck repair table 《table_name》; > > Thanks > Deepak > > On Feb 11, 2018 11:14, "☼ R Nair (रविशंकर नायर)" < > ravishankar.n...@gmail.com> wrote: > >> No, can you please explain the

Re: Spark Dataframe and HIVE

2018-02-10 Thread Deepak Sharma
In hive cli: msck repair table 《table_name》; Thanks Deepak On Feb 11, 2018 11:14, "☼ R Nair (रविशंकर नायर)" wrote: > No, can you please explain the command? Let me try now. > > Best, > > On Sun, Feb 11, 2018 at 12:40 AM, Deepak Sharma >
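When Spark writes partition directories straight into a table's location, the Hive metastore does not learn about them automatically; `MSCK REPAIR TABLE` scans the table path and registers the missing partitions, which is what the suggestion above does. A minimal sketch from spark-shell (the table and column names here are hypothetical, not from the thread):

```scala
// Write a partitioned dataset, then register any partition directories
// that the Hive metastore does not yet know about.
val df = Seq((1, "FL", 2014), (2, "TX", 2015))
  .toDF("policyid", "statecode", "year")

df.write
  .mode("append")
  .partitionBy("year")
  .format("parquet")
  .saveAsTable("mine_partitioned")   // hypothetical table name

// If partitions were written directly to the table path (not via
// saveAsTable), sync the metastore before querying from Hive:
spark.sql("MSCK REPAIR TABLE mine_partitioned")
```

The same `MSCK REPAIR TABLE mine_partitioned;` statement can be run from the Hive CLI instead, as the reply suggests.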

Re: Spark Dataframe and HIVE

2018-02-10 Thread रविशंकर नायर
No, can you please explain the command? Let me try now. Best, On Sun, Feb 11, 2018 at 12:40 AM, Deepak Sharma wrote: > I am not sure about the exact issue but I see you are partitioning while > writing from spark. > Did you try msck repair on the table before reading it

Re: Spark Dataframe and HIVE

2018-02-10 Thread Deepak Sharma
I am not sure about the exact issue but I see you are partitioning while writing from spark. Did you try msck repair on the table before reading it in hive? Thanks Deepak On Feb 11, 2018 11:06, "☼ R Nair (रविशंकर नायर)" wrote: > All, > > Thanks for the inputs.

Re: Spark Dataframe and HIVE

2018-02-10 Thread रविशंकर नायर
All, Thanks for the inputs. Again I am not successful. I think we need to resolve this, as this is a very common requirement. Please go through my complete code: STEP 1: Started Spark shell as spark-shell --master yarn STEP 2: Following code is being given as input to the Spark shell import
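The full message is truncated, but the round trip being attempted (write a partitioned table from spark-shell, read it back in Hive) can be sketched as follows. This is a sketch under assumptions: the input path is hypothetical, the column names follow the `SHOW CREATE TABLE mine` output earlier in the thread, and it assumes spark-shell was built with Hive support and can see hive-site.xml:

```scala
// STEP 1: spark-shell --master yarn
// STEP 2: read the source data and write it as a partitioned Hive table
val df = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("/data/insurance.csv")        // hypothetical input path

df.write
  .mode("overwrite")
  .partitionBy("statecode")
  .saveAsTable("default.mine")       // registers table and partitions
                                     // in the Hive metastore

// STEP 3: verify; the same query should also work from the hive CLI
spark.sql(
  "SELECT statecode, count(*) FROM default.mine GROUP BY statecode"
).show()
```

Writing via `saveAsTable` (rather than saving files to a path and pointing an external table at them) avoids the metastore drifting out of sync with the partition directories.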

Spark cannot find tables in Oracle database

2018-02-10 Thread Lian Jiang
Hi, I am following https://spark.apache.org/docs/latest/sql-programming-guide.html#jdbc-to-other-databases to query an Oracle 12.1 database from spark shell 2.11.8. val jdbcDF = spark.read .format("jdbc") .option("url", "jdbc:oracle:thin:@(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST =
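The snippet above is truncated, but a common cause of "table not found" over the Oracle JDBC source is identifier case: Oracle folds unquoted identifiers to upper case, so the `dbtable` option usually has to be upper case (or schema-qualified). A hedged sketch, with hypothetical host, service, schema, and credentials:

```scala
// Launch with the Oracle driver on the classpath, e.g.:
//   spark-shell --jars ojdbc8.jar
val jdbcDF = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCLPDB1") // hypothetical
  .option("dbtable", "HR.EMPLOYEES")   // upper case: Oracle folds
                                       // unquoted identifiers to upper case
  .option("user", "scott")             // hypothetical credentials
  .option("password", "tiger")
  .option("driver", "oracle.jdbc.OracleDriver")
  .load()

jdbcDF.printSchema()
```

If the table was created with quoted lower-case identifiers, the name must match that quoting exactly, e.g. `"HR"."employees"`.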

Re: Apache Spark - Structured Streaming - Updating UDF state dynamically at run time

2018-02-10 Thread M Singh
Just checking if anyone has any pointers for dynamically updating query state in structured streaming. Thanks On Thursday, February 8, 2018 2:58 PM, M Singh wrote: Hi Spark Experts: I am trying to use a stateful udf with spark structured streaming that

Apache Spark - Structured Streaming Query Status - field descriptions

2018-02-10 Thread M Singh
Hi: I am working with spark 2.2.0 and am looking at the query status console output. My application reads from kafka - performs flatMapGroupsWithState and then aggregates the elements for two group counts. The output is sent to the console sink. I see the following output (with my questions
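The fields being asked about come from `StreamingQueryProgress`, which each trigger emits and which is also available programmatically. A small sketch of inspecting it (assumes `query` is the `StreamingQuery` returned by `writeStream.start()`):

```scala
// Inspect the most recent progress report of a running query.
val p = query.lastProgress

println(p.id)                      // stable unique id of this query
println(p.numInputRows)            // rows read during the last trigger
println(p.inputRowsPerSecond)      // arrival rate since previous trigger
println(p.processedRowsPerSecond)  // processing rate of the last trigger
// One entry per stateful operator (flatMapGroupsWithState, aggregation):
p.stateOperators.foreach { s =>
  println(s.numRowsTotal)          // state rows currently kept
  println(s.numRowsUpdated)        // state rows touched in last trigger
}
```

These are the same values printed as JSON in the console status output that the message goes on to quote.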

Re: Log analysis with GraphX

2018-02-10 Thread Philippe de Rochambeau
Hi Jörn, thank you for replying. By « path analysis », I mean « the user’s navigation from page to page on the website » and by « clicking trends » I mean « which buttons does he/she click and in what order ». In other words, I’d like to measure, make sense out of, and perhaps, predict user

optimize hive query to move a subset of data from one partition table to another table

2018-02-10 Thread amit kumar singh
Hi Team, We have a Hive external table with 50 TB of data, partitioned on year/month/day. I want to move the last 2 months of data into another table. When I try to do this through Spark, more than 120k tasks are created. What is the best way to do this? Thanks Rohit
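One way to keep the job from touching all 50 TB is to filter on the partition columns so only the last two months' directories are read (partition pruning), and to let dynamic partitioning write the target. A sketch under assumptions: table and column names are hypothetical, and the two-month window is hard-coded rather than computed:

```scala
// Dynamic partition insert restricted to two months of partitions.
// Only the matching year/month directories are scanned, not the
// whole 50 TB table.
spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

spark.sql("""
  INSERT OVERWRITE TABLE target_table PARTITION (year, month, day)
  SELECT * FROM source_table
  WHERE year = 2018 AND month IN (1, 2)
""")
```

If the task count is still high because the source has many small files, raising `spark.sql.files.maxPartitionBytes` makes Spark pack more small files into each input task.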

Re: Log analysis with GraphX

2018-02-10 Thread Jörn Franke
What do you mean by path analysis and clicking trends? If you want to use typical graph algorithm such as longest path, shortest path (to detect issues with your navigation page) or page rank then probably yes. Similarly if you do a/b testing to compare if you sell more with different

Log analysis with GraphX

2018-02-10 Thread Philippe de Rochambeau
Hello, Let’s say a website log is structured as follows: timestamp;button;page;userId, e.g.

2018-01-02 12:00:00;OKK;PAG1;1234555
2018-01-02 12:01:01;NEX;PAG1;1234555
2018-01-02 12:00:02;OKK;PAG1;5556667
2018-01-02 12:01:03;NEX;PAG1;5556667

where OKK stands for the OK Button on Page 1, NEX, the Next Button on Page 2, …
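For the path-analysis question in this thread, one way to feed such a log into GraphX is to turn each user's ordered clicks into page-to-page transition edges. A sketch under assumptions: the log path is hypothetical and the field layout follows the sample lines above:

```scala
import org.apache.spark.graphx.{Edge, Graph}

// Parsed log line: timestamp;button;page;userId
case class Click(ts: String, button: String, page: String, user: String)

val clicks = spark.read.textFile("/logs/site.log")   // hypothetical path
  .map(_.split(";"))
  .map(a => Click(a(0), a(1), a(2), a(3)))

// Per user, pair each click with the next one to get navigation edges.
val edges = clicks.rdd
  .groupBy(_.user)
  .flatMap { case (_, cs) =>
    val ordered = cs.toSeq.sortBy(_.ts)
    ordered.zip(ordered.tail).map { case (from, to) =>
      Edge(from.page.hashCode.toLong, to.page.hashCode.toLong, 1L)
    }
  }

val graph = Graph.fromEdges(edges, defaultValue = 0L)
// graph.pageRank(0.001) would then rank pages by navigation importance,
// as suggested in the reply above.
```

Hashing page names to vertex ids is a shortcut for the sketch; a real job would build a proper page-to-id dictionary to avoid hash collisions.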

Re: Sharing spark executor pool across multiple long running spark applications

2018-02-10 Thread Nirav Patel
I did take a look at SJC earlier. It does look like it fits our use case. It seems to be integrated in Datastax too. Apache Livy looks promising as well. I will look into these further. I think for a real-time app that needs subsecond latency, spark dynamic allocation won't work. Thanks! On Wed, Feb 7,

can udaf's return complex types?

2018-02-10 Thread kant kodali
Hi All, Can UDAFs return complex types, say a Map with key as an Integer and the value as an Array of strings? For example, say I have the following *input dataframe*:

id | name | amount
------------------
1  | foo  | 10
2  | bar  | 15
1  | car  | 20
1  | bus  | 20

and
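UDAFs can return complex types: `dataType` may be any Spark SQL `DataType`, including `MapType` and `ArrayType`. A sketch (the class name and the grouping logic are illustrative, not from the thread) of a Spark 2.x `UserDefinedAggregateFunction` whose result is `Map[Int, Array[String]]`, collecting names keyed by their amount:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._

// Aggregates (name, amount) rows into a map: amount -> names seen with it.
class NamesByAmount extends UserDefinedAggregateFunction {
  def inputSchema: StructType =
    new StructType().add("name", StringType).add("amount", IntegerType)
  def bufferSchema: StructType =
    new StructType().add("acc", MapType(IntegerType, ArrayType(StringType)))
  def dataType: DataType = MapType(IntegerType, ArrayType(StringType))
  def deterministic: Boolean = true

  def initialize(buffer: MutableAggregationBuffer): Unit =
    buffer(0) = Map.empty[Int, Seq[String]]

  def update(buffer: MutableAggregationBuffer, input: Row): Unit = {
    val acc = buffer.getMap[Int, Seq[String]](0).toMap
    val (name, amount) = (input.getString(0), input.getInt(1))
    buffer(0) = acc + (amount -> (acc.getOrElse(amount, Seq.empty) :+ name))
  }

  def merge(b1: MutableAggregationBuffer, b2: Row): Unit = {
    val m1 = b1.getMap[Int, Seq[String]](0).toMap
    val m2 = b2.getMap[Int, Seq[String]](0)
    b1(0) = m2.foldLeft(m1) { case (acc, (k, v)) =>
      acc + (k -> (acc.getOrElse(k, Seq.empty) ++ v))
    }
  }

  def evaluate(buffer: Row): Any = buffer.getMap[Int, Seq[String]](0)
}

// Usage sketch: one map per id
// df.groupBy("id").agg(new NamesByAmount()(col("name"), col("amount")))
```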