Re: Loading already existing tables in spark shell

2015-08-25 Thread Ishwardeep Singh
Hi Jeetendra,



Please try the following in the spark shell. It is like executing a SQL command.



sqlContext.sql("use <database name>")
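
For example, a minimal spark-shell session might look like this (just a sketch; "my_database" is a placeholder, and event_impressions is the table from your earlier mail):

scala> val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
scala> sqlContext.sql("use my_database")   // switch to the database that holds the table
scala> sqlContext.sql("select count(*) from event_impressions").show()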



Regards,

Ishwardeep


From: Jeetendra Gangele gangele...@gmail.com
Sent: Tuesday, August 25, 2015 12:57 PM
To: Ishwardeep Singh
Cc: user
Subject: Re: Loading already existing tables in spark shell

In the spark shell "use database" is not working; it says "use" is not found in the shell.
Did you run this in the Scala shell?

On 24 August 2015 at 18:26, Ishwardeep Singh 
ishwardeep.si...@impetus.co.in wrote:

Hi Jeetendra,


I faced this issue. I had not specified the database where the table exists. 
Please set the database by running the "use <database name>" command before executing the 
query.


Regards,

Ishwardeep



From: Jeetendra Gangele gangele...@gmail.com
Sent: Monday, August 24, 2015 5:47 PM
To: user
Subject: Loading already existing tables in spark shell

Hi All, I have a few tables in Hive and I want to run queries against them with 
Spark as the execution engine.

Can I directly load these tables in the spark shell and run queries?

I tried with
1. val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
2. sqlContext.sql("FROM event_impressions select count(*)") where 
event_impressions is the table name.

It gives me an error saying org.apache.spark.sql.AnalysisException: no such table 
event_impressions; line 1 pos 5

Does anybody hit similar issues?


regards
jeetendra










Re: Loading already existing tables in spark shell

2015-08-24 Thread Ishwardeep Singh
Hi Jeetendra,


I faced this issue. I had not specified the database where the table exists. 
Please set the database by running the "use <database name>" command before executing the 
query.


Regards,

Ishwardeep



From: Jeetendra Gangele gangele...@gmail.com
Sent: Monday, August 24, 2015 5:47 PM
To: user
Subject: Loading already existing tables in spark shell

Hi All, I have a few tables in Hive and I want to run queries against them with 
Spark as the execution engine.

Can I directly load these tables in the spark shell and run queries?

I tried with
1. val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
2. sqlContext.sql("FROM event_impressions select count(*)") where 
event_impressions is the table name.

It gives me an error saying org.apache.spark.sql.AnalysisException: no such table 
event_impressions; line 1 pos 5

Does anybody hit similar issues?


regards
jeetendra










RE: Spark SQL support for Hive 0.14

2015-08-05 Thread Ishwardeep Singh
Thanks Steve and Michael for your response.

Is there a tentative release date for Spark 1.5?

From: Michael Armbrust [mailto:mich...@databricks.com]
Sent: Tuesday, August 4, 2015 11:53 PM
To: Steve Loughran ste...@hortonworks.com
Cc: Ishwardeep Singh ishwardeep.si...@impetus.co.in; user@spark.apache.org
Subject: Re: Spark SQL support for Hive 0.14

I'll add that while Spark SQL 1.5 compiles against Hive 1.2.1, it has support 
for reading from metastores for Hive 0.12 through 1.2.1.
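
For reference, in Spark 1.5 the metastore version is selected through the spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars settings, roughly like this (a sketch; the version and jar paths are placeholders for whatever your cluster actually runs):

./bin/spark-shell \
  --conf spark.sql.hive.metastore.version=0.14.0 \
  --conf spark.sql.hive.metastore.jars=/path/to/hive-0.14-jars/*:/path/to/hadoop-client-jars/*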

On Tue, Aug 4, 2015 at 9:59 AM, Steve Loughran 
ste...@hortonworks.com wrote:
Spark 1.3.1 & 1.4 only support Hive 0.13

Spark 1.5 is going to be released against Hive 1.2.1; it'll skip Hive 0.14 
support entirely and go straight to the currently supported Hive release.

See SPARK-8064 for the gory details

 On 3 Aug 2015, at 23:01, Ishwardeep Singh 
 ishwardeep.si...@impetus.co.in wrote:

 Hi,

 Does Spark SQL support Hive 0.14? The documentation refers to Hive 0.13. Is
 there a way to compile Spark with Hive 0.14?

 Currently we are using Spark 1.3.1.

 Thanks



 --
 View this message in context: 
 http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-support-for-Hive-0-14-tp24122.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.












Spark SQL support for Hive 0.14

2015-08-04 Thread Ishwardeep Singh
Hi,

Does Spark SQL support Hive 0.14? The documentation refers to Hive 0.13. Is
there a way to compile Spark with Hive 0.14?

Currently we are using Spark 1.3.1.

Thanks 



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-support-for-Hive-0-14-tp24122.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: Unable to query existing hive table from spark sql 1.3.0

2015-08-03 Thread Ishwardeep Singh
Which database is your table in - default or result? By default Spark will
look for the table in the default database.

If the table exists in the result database, try to prefix the table name
with the database name, like "select * from result.salarytest", or set the
database by executing "use <database name>".
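
For example, either of the following should work in the spark shell (just a sketch, reusing the salarytest table name from this thread):

sqlContext.sql("use result")
sqlContext.sql("select * from salarytest").show()

// or, without switching databases:
sqlContext.sql("select * from result.salarytest").show()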




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-query-existing-hive-table-from-spark-sql-1-3-0-tp24108p24121.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




RE: [Spark SQL 1.3.1] data frame saveAsTable returns exception

2015-05-14 Thread Ishwardeep Singh
Hi Michael & Ayan,

Thank you for your response to my problem.

Michael, do we have a tentative release date for Spark version 1.4?

Regards,
Ishwardeep


From: Michael Armbrust [mailto:mich...@databricks.com]
Sent: Wednesday, May 13, 2015 10:54 PM
To: ayan guha
Cc: Ishwardeep Singh; user
Subject: Re: [Spark SQL 1.3.1] data frame saveAsTable returns exception

I think this is a bug in our date handling that should be fixed in Spark 1.4.
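
Until 1.4 is out, one possible workaround (not something from this thread, just a sketch) is to cast the offending date column to a supported type before writing. Using the dateDimDF frame and the date_dim schema shown later in this thread:

// cast d_date (a date column) to string before saveAsTable; every other
// column is passed through unchanged
val fixedDF = dateDimDF.select(dateDimDF.columns.map {
  case c @ "d_date" => dateDimDF(c).cast("string").as(c)
  case c            => dateDimDF(c)
}: _*)
fixedDF.saveAsTable("date_dim_tera_save")

A similar cast (for example decimal to double) may be needed for the DecimalType columns behind the first exception.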

On Wed, May 13, 2015 at 8:23 AM, ayan guha 
guha.a...@gmail.com wrote:

Your stack trace says it can't convert date to integer. You sure about column 
positions?
On 13 May 2015 21:32, Ishwardeep Singh 
ishwardeep.si...@impetus.co.in wrote:
Hi,

I am using Spark SQL 1.3.1.

I have created a DataFrame using the jdbc data source, and the saveAsTable()
method throws the following 2 exceptions:

java.lang.RuntimeException: Unsupported datatype DecimalType()
at scala.sys.package$.error(package.scala:27)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$fromDataType$2.apply(ParquetTypes.scala:372)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$fromDataType$2.apply(ParquetTypes.scala:316)
at scala.Option.getOrElse(Option.scala:120)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$.fromDataType(ParquetTypes.scala:315)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$4.apply(ParquetTypes.scala:395)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$4.apply(ParquetTypes.scala:394)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at
scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$.convertFromAttributes(ParquetTypes.scala:393)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$.writeMetaData(ParquetTypes.scala:440)
at
org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.prepareMetadata(newParquet.scala:260)
at
org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$6.apply(newParquet.scala:276)
at
org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$6.apply(newParquet.scala:269)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at
scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at
org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.refresh(newParquet.scala:269)
at
org.apache.spark.sql.parquet.ParquetRelation2.init(newParquet.scala:391)
at
org.apache.spark.sql.parquet.DefaultSource.createRelation(newParquet.scala:98)
at
org.apache.spark.sql.parquet.DefaultSource.createRelation(newParquet.scala:128)
at
org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:240)
at
org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect.run(commands.scala:218)
at
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:54)
at
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:54)
at
org.apache.spark.sql.execution.ExecutedCommand.execute(commands.scala:64)
at
org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:1099)
at
org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:1099)
at org.apache.spark.sql.DataFrame.saveAsTable(DataFrame.scala:1121)
at org.apache.spark.sql.DataFrame.saveAsTable(DataFrame.scala:1071)
at org.apache.spark.sql.DataFrame.saveAsTable(DataFrame.scala:1037)
at org.apache.spark.sql.DataFrame.saveAsTable(DataFrame.scala:1015)

java.lang.ClassCastException: java.sql.Date cannot be cast to
java.lang.Integer
at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:106)
at
org.apache.spark.sql.parquet.RowWriteSupport.writePrimitive(ParquetTableSupport.scala:215)
at
org.apache.spark.sql.parquet.RowWriteSupport.writeValue(ParquetTableSupport.scala:192)
at
org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:171)
at
org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:134)
at
parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:120)
at
parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java

[Spark SQL 1.3.1] data frame saveAsTable returns exception

2015-05-13 Thread Ishwardeep Singh
Hi,

I am using Spark SQL 1.3.1.

I have created a DataFrame using the jdbc data source, and the saveAsTable()
method throws the following 2 exceptions:

java.lang.RuntimeException: Unsupported datatype DecimalType()
at scala.sys.package$.error(package.scala:27)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$fromDataType$2.apply(ParquetTypes.scala:372)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$fromDataType$2.apply(ParquetTypes.scala:316)
at scala.Option.getOrElse(Option.scala:120)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$.fromDataType(ParquetTypes.scala:315)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$4.apply(ParquetTypes.scala:395)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$4.apply(ParquetTypes.scala:394)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at
scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$.convertFromAttributes(ParquetTypes.scala:393)
at
org.apache.spark.sql.parquet.ParquetTypesConverter$.writeMetaData(ParquetTypes.scala:440)
at
org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.prepareMetadata(newParquet.scala:260)
at
org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$6.apply(newParquet.scala:276)
at
org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache$$anonfun$6.apply(newParquet.scala:269)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at
scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at
scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at
org.apache.spark.sql.parquet.ParquetRelation2$MetadataCache.refresh(newParquet.scala:269)
at
org.apache.spark.sql.parquet.ParquetRelation2.init(newParquet.scala:391)
at
org.apache.spark.sql.parquet.DefaultSource.createRelation(newParquet.scala:98)
at
org.apache.spark.sql.parquet.DefaultSource.createRelation(newParquet.scala:128)
at
org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:240)
at
org.apache.spark.sql.hive.execution.CreateMetastoreDataSourceAsSelect.run(commands.scala:218)
at
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:54)
at
org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:54)
at
org.apache.spark.sql.execution.ExecutedCommand.execute(commands.scala:64)
at
org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:1099)
at
org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:1099)
at org.apache.spark.sql.DataFrame.saveAsTable(DataFrame.scala:1121)
at org.apache.spark.sql.DataFrame.saveAsTable(DataFrame.scala:1071)
at org.apache.spark.sql.DataFrame.saveAsTable(DataFrame.scala:1037)
at org.apache.spark.sql.DataFrame.saveAsTable(DataFrame.scala:1015)

java.lang.ClassCastException: java.sql.Date cannot be cast to
java.lang.Integer
at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:106)
at
org.apache.spark.sql.parquet.RowWriteSupport.writePrimitive(ParquetTableSupport.scala:215)
at
org.apache.spark.sql.parquet.RowWriteSupport.writeValue(ParquetTableSupport.scala:192)
at
org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:171)
at
org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:134)
at
parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:120)
at
parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:81)
at
parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:37)
at
org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:671)
at
org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689)
at
org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689)
at
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  

Re: [Spark SQL 1.3.1] data frame saveAsTable returns exception

2015-05-13 Thread Ishwardeep Singh
Hi,

I am using the spark-shell, and the steps to reproduce the issue
are as follows:

scala> val dateDimDF =
sqlContext.load("jdbc", Map("url" -> "jdbc:teradata://192.168.145.58/DBS_PORT=1025,DATABASE=BENCHQADS,LOB_SUPPORT=OFF,USER=BENCHQADS,PASSWORD=abc", "dbtable" -> "date_dim"))

scala> dateDimDF.printSchema()

root
 |-- d_date_sk: integer (nullable = false)
 |-- d_date_id: string (nullable = false)
 |-- d_date: date (nullable = true)
 |-- d_month_seq: integer (nullable = true)
 |-- d_week_seq: integer (nullable = true)
 |-- d_quarter_seq: integer (nullable = true)
 |-- d_year: integer (nullable = true)
 |-- d_dow: integer (nullable = true)
 |-- d_moy: integer (nullable = true)
 |-- d_dom: integer (nullable = true)
 |-- d_qoy: integer (nullable = true)
 |-- d_fy_year: integer (nullable = true)
 |-- d_fy_quarter_seq: integer (nullable = true)
 |-- d_fy_week_seq: integer (nullable = true)
 |-- d_day_name: string (nullable = true)
 |-- d_quarter_name: string (nullable = true)
 |-- d_holiday: string (nullable = true)
 |-- d_weekend: string (nullable = true)
 |-- d_following_holiday: string (nullable = true)
 |-- d_first_dom: integer (nullable = true)
 |-- d_last_dom: integer (nullable = true)
 |-- d_same_day_ly: integer (nullable = true)
 |-- d_same_day_lq: integer (nullable = true)
 |-- d_current_day: string (nullable = true)
 |-- d_current_week: string (nullable = true)
 |-- d_current_month: string (nullable = true)
 |-- d_current_quarter: string (nullable = true)
 |-- d_current_year: string (nullable = true)

scala> dateDimDF.saveAsTable("date_dim_tera_save")

15/05/13 19:57:05 INFO JDBCRDD: closed connection
15/05/13 19:57:05 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
java.lang.ClassCastException: java.sql.Date cannot be cast to
java.lang.Integer
at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:106)
at
org.apache.spark.sql.parquet.RowWriteSupport.writePrimitive(ParquetTableSupport.scala:215)
at
org.apache.spark.sql.parquet.RowWriteSupport.writeValue(ParquetTableSupport.scala:192)
at
org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:171)
at
org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:134)
at
parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:120)
at
parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:81)
at
parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:37)
at
org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:671)
at
org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689)
at
org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689)
at
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
15/05/13 19:57:05 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2,
localhost): java.lang.ClassCastException: java.sql.Date cannot be cast to
java.lang.Integer
at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:106)
at
org.apache.spark.sql.parquet.RowWriteSupport.writePrimitive(ParquetTableSupport.scala:215)
at
org.apache.spark.sql.parquet.RowWriteSupport.writeValue(ParquetTableSupport.scala:192)
at
org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:171)
at
org.apache.spark.sql.parquet.RowWriteSupport.write(ParquetTableSupport.scala:134)
at
parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:120)
at
parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:81)
at
parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:37)
at
org.apache.spark.sql.parquet.ParquetRelation2.org$apache$spark$sql$parquet$ParquetRelation2$$writeShard$1(newParquet.scala:671)
at
org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689)
at
org.apache.spark.sql.parquet.ParquetRelation2$$anonfun$insert$2.apply(newParquet.scala:689)
at
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)


scala> val 

Re: Unable to join table across data sources using sparkSQL

2015-05-08 Thread Ishwardeep Singh
Finally got it working.

I was trying to access Hive using the JDBC driver, the same way I was
accessing Teradata.

It took me some time to figure out that the default sqlContext created by Spark
supports Hive and uses the hive-site.xml in the Spark conf folder to access
Hive.

I had to switch to my database in Hive.

spark-shell> sqlContext.sql("use terradata_live")

Then I registered my Teradata database tables as temporary tables.

spark-shell> val itemDF =
hc.load("jdbc", Map("url" -> "jdbc:teradata://192.168.145.58/DBS_PORT=1025,DATABASE=BENCHQADS,LOB_SUPPORT=OFF,USER=BENCHQADS,PASSWORD=", "dbtable" -> "item"))

spark-shell> itemDF.registerTempTable("itemterra")

spark-shell> sqlContext.sql("select store_sales.* from store_sales join
itemterra on (store_sales.id = itemterra.sales_id)")

But there seems to be some issue when I try to do the same using the Hive JDBC
driver. Another difference that I found was in the printSchema() output: for a
data frame created using the Hive driver the column names are prefixed with the
table name, but the same does not happen for the Teradata tables.




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-join-table-across-data-sources-using-sparkSQL-tp22761p22816.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




RE: Unable to join table across data sources using sparkSQL

2015-05-05 Thread Ishwardeep Singh
Hi Ankit,

printSchema() works fine for all the tables.

hiveStoreSalesDF.printSchema()
root
|-- store_sales.ss_sold_date_sk: integer (nullable = true)
|-- store_sales.ss_sold_time_sk: integer (nullable = true)
|-- store_sales.ss_item_sk: integer (nullable = true)
|-- store_sales.ss_customer_sk: integer (nullable = true)
|-- store_sales.ss_cdemo_sk: integer (nullable = true)
|-- store_sales.ss_hdemo_sk: integer (nullable = true)
|-- store_sales.ss_addr_sk: integer (nullable = true)
|-- store_sales.ss_store_sk: integer (nullable = true)
|-- store_sales.ss_promo_sk: integer (nullable = true)
|-- store_sales.ss_ticket_number: integer (nullable = true)
|-- store_sales.ss_quantity: integer (nullable = true)
|-- store_sales.ss_wholesale_cost: double (nullable = true)
|-- store_sales.ss_list_price: double (nullable = true)
|-- store_sales.ss_sales_price: double (nullable = true)
|-- store_sales.ss_ext_discount_amt: double (nullable = true)
|-- store_sales.ss_ext_sales_price: double (nullable = true)
|-- store_sales.ss_ext_wholesale_cost: double (nullable = true)
|-- store_sales.ss_ext_list_price: double (nullable = true)
|-- store_sales.ss_ext_tax: double (nullable = true)
|-- store_sales.ss_coupon_amt: double (nullable = true)
|-- store_sales.ss_net_paid: double (nullable = true)
|-- store_sales.ss_net_paid_inc_tax: double (nullable = true)
|-- store_sales.ss_net_profit: double (nullable = true)

dateDimDF.printSchema()
root
|-- d_date_sk: integer (nullable = false)
|-- d_date_id: string (nullable = false)
|-- d_date: date (nullable = true)
|-- d_month_seq: integer (nullable = true)
|-- d_week_seq: integer (nullable = true)
|-- d_quarter_seq: integer (nullable = true)
|-- d_year: integer (nullable = true)
|-- d_dow: integer (nullable = true)
|-- d_moy: integer (nullable = true)
|-- d_dom: integer (nullable = true)
|-- d_qoy: integer (nullable = true)
|-- d_fy_year: integer (nullable = true)
|-- d_fy_quarter_seq: integer (nullable = true)
|-- d_fy_week_seq: integer (nullable = true)
|-- d_day_name: string (nullable = true)
|-- d_quarter_name: string (nullable = true)
|-- d_holiday: string (nullable = true)
|-- d_weekend: string (nullable = true)
|-- d_following_holiday: string (nullable = true)
|-- d_first_dom: integer (nullable = true)
|-- d_last_dom: integer (nullable = true)
|-- d_same_day_ly: integer (nullable = true)
|-- d_same_day_lq: integer (nullable = true)
|-- d_current_day: string (nullable = true)
|-- d_current_week: string (nullable = true)
|-- d_current_month: string (nullable = true)
|-- d_current_quarter: string (nullable = true)
|-- d_current_year: string (nullable = true)

itemDF.printSchema()
root
|-- i_item_sk: integer (nullable = false)
|-- i_item_id: string (nullable = false)
|-- i_rec_start_date: date (nullable = true)
|-- i_rec_end_date: date (nullable = true)
|-- i_item_desc: string (nullable = true)
|-- i_current_price: decimal (nullable = true)
|-- i_wholesale_cost: decimal (nullable = true)
|-- i_brand_id: integer (nullable = true)
|-- i_brand: string (nullable = true)
|-- i_class_id: integer (nullable = true)
|-- i_class: string (nullable = true)
|-- i_category_id: integer (nullable = true)
|-- i_category: string (nullable = true)
|-- i_manufact_id: integer (nullable = true)
|-- i_manufact: string (nullable = true)
|-- i_size: string (nullable = true)
|-- i_formulation: string (nullable = true)
|-- i_color: string (nullable = true)
|-- i_units: string (nullable = true)
|-- i_container: string (nullable = true)
|-- i_manager_id: integer (nullable = true)
|-- i_product_name: string (nullable = true)

Regards,
Ishwardeep

From: ankitjindal [via Apache Spark User List] 
[mailto:ml-node+s1001560n22766...@n3.nabble.com]
Sent: Tuesday, May 5, 2015 5:00 PM
To: Ishwardeep Singh
Subject: RE: Unable to join table across data sources using sparkSQL

Just check the schema of both the tables using frame.printSchema();










RE: Unable to join table across data sources using sparkSQL

2015-05-05 Thread Ishwardeep Singh
Hi,

I am using Spark 1.3.0.

I was able to join a JSON file on HDFS, registered as a temp table, with a table 
in MySQL. Along the same lines I tried to join a table in Hive with another table 
in Teradata, but I get a query parse exception.

Regards,
Ishwardeep


From: ankitjindal [via Apache Spark User List] 
[mailto:ml-node+s1001560n22762...@n3.nabble.com]
Sent: Tuesday, May 5, 2015 1:26 PM
To: Ishwardeep Singh
Subject: Re: Unable to join table across data sources using sparkSQL

Hi

I was doing the same, but with a file in Hadoop as a temp table and one 
table in SQL Server, and I succeeded.

Which Spark version are you using currently?

Thanks
Ankit














--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-join-table-across-data-sources-using-sparkSQL-tp22761p22763.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

RE: Unable to compile spark 1.1.0 on windows 8.1

2014-12-01 Thread Ishwardeep Singh
Hi Judy,

Thank you for your response.

When I try to compile using Maven with "mvn -Dhadoop.version=1.2.1 -DskipTests
clean package" I get the error "Error: Could not find or load main class".
I have Maven 3.0.4.

And when I run the command "sbt package" I get the same exception as earlier.

I have done the following steps:

1. Downloaded spark-1.1.0.tgz from the Spark site and unzipped it
to the folder d:\myworkplace\software\spark-1.1.0
2. Downloaded sbt-0.13.7.zip and extracted it to the folder
d:\myworkplace\software\sbt
3. Updated the PATH environment variable to include
d:\myworkplace\software\sbt\bin.
4. Navigated to the spark folder d:\myworkplace\software\spark-1.1.0
5. Ran the command "sbt assembly"
6. As a side effect of this command a number of libraries were downloaded, and
I got an initial error that the path
C:\Users\ishwardeep.singh\.sbt\0.13\staging\ec3aa8f39111944cc5f2\sbt-pom-reader
does not exist.
7. Manually created this subfolder ec3aa8f39111944cc5f2\sbt-pom-reader
and retried, and got the error described at the beginning of this message.

Is this the correct procedure to compile Spark 1.1.0? Please let me know.
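
For comparison, the Maven route documented for building Spark 1.1 looks roughly like this (a sketch of the documented defaults, adapted to a Windows command prompt and not verified there):

rem give Maven enough memory, as the Spark 1.1 build docs recommend
set MAVEN_OPTS=-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m
rem then build from the Spark source root
mvn -Dhadoop.version=1.2.1 -DskipTests clean package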

Hoping to hear from you soon.

Regards,
ishwardeep



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-compile-spark-1-1-0-on-windows-8-1-tp19996p20075.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Unable to compile spark 1.1.0 on windows 8.1

2014-11-27 Thread Ishwardeep Singh
Hi,

I am trying to compile Spark 1.1.0 on Windows 8.1 but I get the following
exception.

[info] Compiling 3 Scala sources to
D:\myworkplace\software\spark-1.1.0\project\target\scala-2.10\sbt0.13\classes...
[error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:26:
object sbt is not a member of package com.typesafe
[error] import com.typesafe.sbt.pom.{PomBuild, SbtPomKeys}
[error] ^
[error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:53: not
found: type PomBuild
[error] object SparkBuild extends PomBuild {
[error]   ^
[error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:121:
not found: value SbtPomKeys
[error] otherResolvers = SbtPomKeys.mvnLocalRepository(dotM2 =
Seq(Resolver.file(dotM2, dotM2))),
[error]^
[error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:165:
value projectDefinitions is not a member of AnyRef
[error] super.projectDefinitions(baseDirectory).map { x =
[error]   ^
[error] four errors found
[error] (plugins/compile:compile) Compilation failed

I have also set up Scala 2.10.

I need help to resolve this issue.

Regards,
Ishwardeep 



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-compile-spark-1-1-0-on-windows-8-1-tp19996.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
