[ANNOUNCE] .NET for Apache Spark™ 2.1 released

2022-02-02 Thread Terry Kim
improvements of this release. Here are some of the highlights: - Support for Apache Spark 3.2 - Exposing new SQL function APIs introduced in Spark 3.2 We would like to thank the community for the great feedback and all those who contributed to this release. Thanks, Terry Kim on behalf of the .NE

Announcing Hyperspace v0.4.0 - an indexing subsystem for Apache Spark™

2021-02-08 Thread Terry Kim
PR to support Iceberg tables. We would like to thank the community for the great feedback and all those who contributed to this release. Thanks, Terry Kim on behalf of the Hyperspace team

Re: [Spark SQL]HiveQL and Spark SQL producing different results

2021-01-12 Thread Terry Kim
Ying, Can you share a query that produces different results? Thanks, Terry On Sun, Jan 10, 2021 at 1:48 PM Ying Zhou wrote: > Hi, > > I run some SQL using both Hive and Spark. Usually we get the same results. > However when a window function is in the script Hive and Spark

Announcing Hyperspace v0.3.0 - an indexing subsystem for Apache Spark™

2020-11-17 Thread Terry Kim
nd all those who contributed to this release. Thanks, Terry Kim on behalf of the Hyperspace team

Announcing .NET for Apache Spark™ 1.0

2020-11-06 Thread Terry Kim
- Support for all the complex types in Spark SQL - Support for Delta Lake <https://github.com/delta-io/delta> v0.7 and Hyperspace <https://github.com/microsoft/hyperspace> v0.2 We would like to thank the community for the great feedback and all those who contributed to this release

Re: Renaming a DataFrame column makes Spark lose partitioning information

2020-08-04 Thread Terry Kim
on($"c") .explain() // Exiting paste mode, now interpreting. == Physical Plan == *(1) Project [a#7, b#8 AS c#11] +- Exchange hashpartitioning(b#8, 200), false, [id=#12] +- LocalTableScan [a#7, b#8] Thanks, Terry On Tue, Aug 4, 2020 at 6:26 AM Antoine Wendlinger wrote:

Re: Future timeout

2020-07-20 Thread Terry Kim
"spark.sql.broadcastTimeout" is the config you can use: https://github.com/apache/spark/blob/fe07521c9efd9ce0913eee0d42b0ffd98b1225ec/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala#L863 Thanks, Terry On Mon, Jul 20, 2020 at 11:20 AM Amit Sharma wrote: >

Announcing .NET for Apache Spark™ 0.12

2020-07-02 Thread Terry Kim
4.6 (3.0 support is on the way!) - SparkSession.CreateDataFrame, Broadcast variable - Preliminary support for MLlib (TF-IDF, Word2Vec, Bucketizer, etc.) - Support for .NET Core 3.1 We would like to thank all those who contributed to this release. Thanks, Terry Kim on behalf of the .NET for Apache Spark™ team

Hyperspace v0.1 is now open-sourced!

2020-07-02 Thread Terry Kim
-an-indexing-subsystem-for-apache-spark - Docs: https://aka.ms/hyperspace This project would not have been possible without the outstanding work from the Apache Spark™ community. Thank you everyone and we look forward to collaborating with the community towards evolving Hyperspace. Thanks, Terry Kim

Re: Using existing distribution for join when subset of keys

2020-05-31 Thread Terry Kim
ocation: InMemoryFileIndex[file:/], PartitionFilters: [], PushedFilters: [IsNotNull(x), IsNotNull(y)], ReadSchema: struct<…>, SelectedBucketsCount: 8 out of 8 On Sun, May 31, 2020 at 2:38 PM Patrick Woody wrote: > Hey Terry, > > Thanks for the response! I'm not sure that it ends up working

Re: Using existing distribution for join when subset of keys

2020-05-31 Thread Terry Kim
You can use bucketBy to avoid shuffling in your scenario. This test suite has some examples: https://github.com/apache/spark/blob/45cf5e99503b00a6bd83ea94d6d92761db1a00ab/sql/core/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala#L343 Thanks, Terry On Sun, May 31, 2020 at 7:43
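
A sketch of the bucketBy approach (table names, column names, and the bucket count are made up; bucketBy only works with saveAsTable):

    // Persist both sides bucketed and sorted by the join keys so a later join
    // can reuse the existing distribution instead of shuffling.
    df1.write.bucketBy(8, "x", "y").sortBy("x", "y").saveAsTable("t1")
    df2.write.bucketBy(8, "x", "y").sortBy("x", "y").saveAsTable("t2")

    // With matching bucket specs, the join avoids the Exchange on both sides.
    spark.table("t1").join(spark.table("t2"), Seq("x", "y")).explain()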

Re: [Spark SQL]: Does namespace name is always needed in a query for tables from a user defined catalog plugin

2019-12-01 Thread Terry Kim
ARK-30094> and will follow up. Thanks, Terry On Sun, Dec 1, 2019 at 7:12 PM xufei wrote: > Hi, > > I'm trying to write a catalog plugin based on spark-3.0-preview, and I > found even when I use 'use catalog.namespace' to set the current catalog > and namespace, I still need to

Announcing .NET for Apache Spark 0.5.0

2019-09-30 Thread Terry Kim
- Support for Spark 2.3.4/2.4.4 The release notes <https://github.com/dotnet/spark/blob/master/docs/release-notes/0.5/release-0.5.md> include the full list of features/improvements of this release. We would like to thank all those who contributed to this release. Thanks, Terry

Re: Release Apache Spark 2.4.4

2019-08-13 Thread Terry Kim
Can the following be included? [SPARK-27234][SS][PYTHON] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs) <https://github.com/apache/spark/pull/24946> Thanks, Terry On Tue, Aug 13, 2019 at 10:24 PM Wenchen Fan wrote: > +1 > > On Wed, Aug 14

Announcing .NET for Apache Spark 0.4.0

2019-07-31 Thread Terry Kim
oading - Local UDF debugging The release notes <https://github.com/dotnet/spark/blob/master/docs/release-notes/0.4/release-0.4.md> include the full list of features/improvements of this release. We would like to thank all those who contributed to this release. Thanks, Terry

The last successful batch before stop re-execute after restart the DStreams with checkpoint

2018-03-11 Thread Terry Hoo
. Regards - Terry

Re: Getting memory error when starting spark shell but not often

2016-09-06 Thread Terry Hoo
Maybe not enough contiguous memory (10G?) on your host. Regards, - Terry On Wed, Sep 7, 2016 at 10:51 AM, Divya Gehlot <divya.htco...@gmail.com> wrote: > Hi, > I am using EMR 4.7 with Spark 1.6 > Sometimes when I start the spark shell I get the below error > > OpenJDK 64-Bit Se

Re: spark2.0 how to use sparksession and StreamingContext same time

2016-07-25 Thread Terry Hoo
Kevin, Try to create the StreamingContext as following: val ssc = new StreamingContext(spark.sparkContext, Seconds(2)) On Tue, Jul 26, 2016 at 11:25 AM, kevin wrote: > hi,all: > I want to read data from kafka and regist as a table then join a jdbc > table. > My
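
Spelled out as a self-contained sketch (app name, master, and batch interval are arbitrary):

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val spark = SparkSession.builder().appName("kafka-join").master("local[2]").getOrCreate()

    // Reuse the SparkContext that backs the SparkSession rather than creating a
    // second context, so the stream and the SQL tables share one context.
    val ssc = new StreamingContext(spark.sparkContext, Seconds(2))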

Re: Another problem about parallel computing

2016-06-13 Thread Terry Hoo
hero, Did you check whether there is any exception after retry? If the port is 0, the spark worker should bind to a random port. BTW, what's the spark version? Regards, - Terry On Mon, Jun 13, 2016 at 4:24 PM, hero <super_big_h...@sina.com> wrote: > Hi, guys > > I have anothe

Re: StackOverflow in Spark

2016-06-13 Thread Terry Hoo
Maybe the same issue as SPARK-6847 <https://issues.apache.org/jira/browse/SPARK-6847>, which has been fixed in Spark 2.0. Regards - Terry On Mon, Jun 13, 2016 at 3:15 PM, Michel Hubert <mich...@phact.nl> wrote: > > > I’ve found my problem. > > > > I’

ArrayIndexOutOfBoundsException in model selection via cross-validation sample with spark 1.6.1

2016-05-04 Thread Terry Hoo
submit(SparkSubmit.scala:206) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Regards - Terry

Re: Number of batches in the Streaming Statics visualization screen

2016-01-29 Thread Terry Hoo
Yes, the data is stored in driver memory. Mehdi Ben Haj Abbes <mehdi.ab...@gmail.com> wrote on Friday, January 29, 2016 at 18:13: > Thanks Terry for the quick answer. > > I did not try it. Let's say I will increase the value to 2, what > side effect should I expect. In fact in the explanati

Re: Number of batches in the Streaming Statics visualization screen

2016-01-29 Thread Terry Hoo
Hi Mehdi, Did you try a larger value of "spark.streaming.ui.retainedBatches" (default is 1000)? Regards, - Terry On Fri, Jan 29, 2016 at 5:45 PM, Mehdi Ben Haj Abbes <mehdi.ab...@gmail.com> wrote: > Hi folks, > > I have a streaming job running for mor
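
A sketch of setting it at startup (the value 2000 is an arbitrary example):

    import org.apache.spark.SparkConf

    // Keep more completed batches around for the streaming UI; default is 1000.
    val conf = new SparkConf()
      .setAppName("streaming-app")
      .set("spark.streaming.ui.retainedBatches", "2000")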

Re: [Spark 1.6][Streaming] About the behavior of mapWithState

2016-01-17 Thread Terry Hoo
state. Regards, -Terry On Sat, Jan 16, 2016 at 6:20 AM, Shixiong(Ryan) Zhu <shixi...@databricks.com> wrote: > Hey Terry, > > That's expected. If you want to only output (1, 3), you can use > "reduceByKey" before "mapWithState" like this: > > dstream.redu
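
The suggestion quoted above, expanded into a hedged sketch (key/value types and the state function are illustrative; `dstream` is assumed to be a DStream[(Int, Int)]):

    import org.apache.spark.streaming.{State, StateSpec}

    // Collapse duplicate keys within each batch first, so mapWithState sees one
    // record per key per batch and emits a single updated value per key.
    val mappingFunc = (key: Int, value: Option[Int], state: State[Int]) => {
      val sum = value.getOrElse(0) + state.getOption.getOrElse(0)
      state.update(sum)
      (key, sum)
    }
    val updated = dstream.reduceByKey(_ + _).mapWithState(StateSpec.function(mappingFunc))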

[Spark 1.6][Streaming] About the behavior of mapWithState

2016-01-15 Thread Terry Hoo
ems with the same key "1": (1,1) and (1,3), is this expected behavior? I would expect (1,3) only. Regards - Terry

[Streaming] Long time to catch up when streaming application restarts from checkpoint

2015-11-06 Thread Terry Hoo
kip these batches or to speed up the catch-up processing? Thanks! Terry

[SQL] Memory leak with spark streaming and spark sql in spark 1.5.1

2015-10-14 Thread Terry Hoo
LListener has about 1K entries), is this a leak in SQLListener? Thanks! Terry

Re: Streaming Application Unable to get Stream from Kafka

2015-10-09 Thread Terry Hoo
Hi Prateek, How many cores (threads) do you assign to spark in local mode? It is very likely the local spark does not have enough resources to proceed. You can check http://yourip:4040 for the details. Thanks! Terry On Fri, Oct 9, 2015 at 10:34 PM, Prateek . <prat...@aricent.com>
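
The usual fix, sketched below: a receiver occupies one thread, so plain "local" or "local[1]" leaves nothing for batch processing.

    import org.apache.spark.SparkConf

    // At least two threads: one for the Kafka receiver, one for the batches.
    val conf = new SparkConf().setMaster("local[2]").setAppName("kafka-stream")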

Re: Cant perform full outer join

2015-09-29 Thread Terry Hoo
Saif, Maybe you can rename one of the dataframes to a different name first, then do an outer join and a select, like this: val cur_d = cur_data.toDF("Date_1", "Value_1") val r = data.join(cur_d, data("DATE") === cur_d("Date_1"), "outer").select($"

Re: Why Checkpoint is throwing "actor.OneForOneStrategy: NullPointerException"

2015-09-24 Thread Terry Hoo
I met this before: in my program, some DStreams were not initialized since they were not in the path of output. You can check if yours is the same case. Thanks! - Terry On Fri, Sep 25, 2015 at 10:22 AM, Tathagata Das <t...@databricks.com> wrote: > Are you by any chanc

Re: How to convert dataframe to a nested StructType schema

2015-09-15 Thread Terry Hole
Hao, For spark 1.4.1, you can try this: val rowrdd = df.rdd.map(r => Row(Row(r(3)), Row(r(0), r(1), r(2)))) val newDF = sqlContext.createDataFrame(rowrdd, yourNewSchema) Thanks! - Terry On Wed, Sep 16, 2015 at 2:10 AM, Hao Wang <billhao.l...@gmail.com> wrote: > Hi, > > I c

Re: Meets "java.lang.IllegalArgumentException" when test spark ml pipe with DecisionTreeClassifier

2015-09-09 Thread Terry Hole
= { val na = NominalAttribute.defaultAttr.withValues("0", "1") na.toMetadata(m) } val newSchema = StructType(schema.map(f => if (f.name == "label") f.copy(metadata=enrich(f.metadata)) else f)) val model = pipeline.fit(sqlContext.createDataFrame(rowRDD, newSchem

Re: Meets "java.lang.IllegalArgumentException" when test spark ml pipe with DecisionTreeClassifier

2015-09-07 Thread Terry Hole
Xiangrui, Do you have any idea how to make this work? Thanks - Terry Terry Hole <hujie.ea...@gmail.com> wrote on Sunday, September 6, 2015 at 17:41: > Sean > > Do you know how to tell the decision tree that the "label" is binary, or set > some attributes on the dataframe to carry the number of cl

Re: Meets "java.lang.IllegalArgumentException" when test spark ml pipe with DecisionTreeClassifier

2015-09-06 Thread Terry Hole
at $iwC$$iwC$$iwC.<init>(<console>:72) at $iwC$$iwC.<init>(<console>:74) at $iwC.<init>(<console>:76) at <init>(<console>:78) at .<init>(<console>:82) at .<clinit>(<console>) at .<init>(<console>:7) at .<clinit>(<console>) at $print(<console>) Thanks! - Terry On Sun, Sep 6, 2015 at 4:53 PM, Sean Owen <so...@cloudera.com> wrote: > I think somewhere alone

Re: Meets "java.lang.IllegalArgumentException" when test spark ml pipe with DecisionTreeClassifier

2015-09-06 Thread Terry Hole
Sean, Do you know how to tell the decision tree that the "label" is binary, or set some attributes on the dataframe to carry the number of classes? Thanks! - Terry On Sun, Sep 6, 2015 at 5:23 PM, Sean Owen <so...@cloudera.com> wrote: > (Sean) > The error suggests that the type is n

SparkSQL without access to arrays?

2015-09-03 Thread Terry
Hi, I'm using Spark 1.4.1. Here is the printSchema after loading my json file:
root
 |-- result: struct (nullable = true)
 |    |-- negative_votes: long (nullable = true)
 |    |-- players: array (nullable = true)
 |    |    |-- account_id: long (nullable = true)
 |    |    |-- assists:
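
For a schema like this, the nested array elements are reachable from both SQL and the DataFrame API; a sketch assuming the loaded DataFrame is `df` and the registered table name "matches" is invented:

    import org.apache.spark.sql.functions.explode

    // Spark 1.4: register the DataFrame, then index into the array in SQL...
    df.registerTempTable("matches")
    sqlContext.sql("SELECT result.players[0].account_id FROM matches").show()

    // ...or explode the array to get one row per player.
    df.select(explode(df("result.players")).as("player"))
      .select("player.account_id")
      .show()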

Re: Job aborted due to stage failure: java.lang.StringIndexOutOfBoundsException: String index out of range: 18

2015-08-28 Thread Terry Hole
Ricky, You may need to use map instead of flatMap in your case: val rowRDD = sc.textFile("/user/spark/short_model").map(_.split("\\t")).map(p => Row(...)) Thanks! -Terry On Fri, Aug 28, 2015 at 5:08 PM, our...@cnsuning.com our...@cnsuning.com wrote: hi all, when using spark sql, a problem

Re: standalone to connect mysql

2015-07-21 Thread Terry Hole
Jack, You can refer to the HiveQL syntax if you use HiveContext: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML Thanks! -Terry That works! Thanks. Can I ask you one further question? How does Spark SQL support insertion? That is to say, if I did: sqlContext.sql(insert
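
For reference, a hedged example of HiveQL-style insertion through HiveContext (table names are invented):

    // Append rows from another table, per the HiveQL DML manual linked above.
    sqlContext.sql("INSERT INTO TABLE target_table SELECT * FROM source_table")

    // Or replace the table's contents instead of appending.
    sqlContext.sql("INSERT OVERWRITE TABLE target_table SELECT * FROM source_table")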

Re: standalone to connect mysql

2015-07-21 Thread Terry Hole
Maybe you can try: spark-submit --class sparkwithscala.SqlApp --jars /home/lib/mysql-connector-java-5.1.34.jar --master spark://hadoop1:7077 /home/myjar.jar Thanks! -Terry Hi there, I would like to use spark to access the data in mysql. So firstly I tried to run the program using

Re: [Spark Shell] Could the spark shell be reset to the original status?

2015-07-16 Thread Terry Hole
. Thanks! - Terry Ted Yu yuzhih...@gmail.com wrote on Friday, July 17, 2015 at 12:02 PM: See this recent thread: http://search-hadoop.com/m/q3RTtFW7iMDkrj61/Spark+shell+oom+subj=java+lang+OutOfMemoryError+PermGen+space On Jul 16, 2015, at 8:51 PM, Terry Hole hujie.ea...@gmail.com wrote: Hi, Background

[Spark Shell] Could the spark shell be reset to the original status?

2015-07-16 Thread Terry Hole
\Local\Temp\spark-2ad09490-c0c6-41e2-addb-63087ce0ae63' but it is not a directory That entry seems to have slain the compiler. Shall I replay your session? I can re-run each line except the last one. [y/n] Abandoning crashed session. Thanks! -Terry

Re: fileStream with old files

2015-07-15 Thread Terry Hole
Hi Hunter, What behavior do you see with HDFS? The local file system and HDFS should have the same behavior. Thanks! - Terry Hunter Morgan hunter.mor...@rackspace.com wrote on Thursday, July 16, 2015 at 2:04 AM: After moving the setting of the parameter to SparkConf initialization instead

Re: fileStream with old files

2015-07-13 Thread Terry Hole
https://issues.apache.org/jira/browse/SPARK-3276 -Terry On Tue, Jul 14, 2015 at 4:44 AM, automaticgiant hunter.mor...@rackspace.com wrote: It's not as odd as it sounds. I want to ensure that long streaming job outages can recover all the files that went into a directory while the job was down. I've looked
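
The flag itself is exposed on fileStream; a sketch (the directory and input types are placeholders):

    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

    val lines = ssc.fileStream[LongWritable, Text, TextInputFormat](
      "hdfs:///incoming",        // directory to monitor
      (path: Path) => true,      // accept every file
      newFilesOnly = false       // also pick up files that predate the job start
    ).map(_._2.toString)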

Re: [Spark Hive SQL] Set the hive connection in hive context is broken in spark 1.4.1-rc1?

2015-07-10 Thread Terry Hole
Michael, Thanks - Terry Michael Armbrust mich...@databricks.com wrote on Saturday, July 11, 2015 at 04:02: Metastore configuration should be set in hive-site.xml. On Thu, Jul 9, 2015 at 8:59 PM, Terry Hole hujie.ea...@gmail.com wrote: Hi, I am trying to set the hive metadata destination to a mysql database
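
A sketch of the relevant hive-site.xml entries for a MySQL-backed metastore (host, database name, and credentials are placeholders):

    <configuration>
      <property>
        <name>javax.jdo.option.ConnectionURL</name>
        <value>jdbc:mysql://dbhost:3306/metastore?createDatabaseIfNotExist=true</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionDriverName</name>
        <value>com.mysql.jdbc.Driver</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionUserName</name>
        <value>hiveuser</value>
      </property>
      <property>
        <name>javax.jdo.option.ConnectionPassword</name>
        <value>hivepass</value>
      </property>
    </configuration>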

[Spark Hive SQL] Set the hive connection in hive context is broken in spark 1.4.1-rc1?

2015-07-09 Thread Terry Hole
() Thanks! -Terry

Re: Is there a way to shutdown the derby in hive context in spark shell?

2015-07-09 Thread Terry Hole
and creating a new one? Thanks Best Regards On Wed, Jul 8, 2015 at 8:12 PM, Terry Hole hujie.ea...@gmail.com wrote: I am using spark 1.4.1rc1 with default hive settings Thanks - Terry Hi All, I'd like to use the hive context in spark shell, i need to recreate the hive meta database in the same

Is there a way to shutdown the derby in hive context in spark shell?

2015-07-08 Thread Terry Hole
(jdbc:derby:;shutdown=true); Thanks! - Terry
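
For context, a Derby system shutdown in code looks roughly like this; Derby signals a clean shutdown by throwing an SQLException with SQLState "XJ015", so the catch is expected:

    import java.sql.{DriverManager, SQLException}

    try {
      DriverManager.getConnection("jdbc:derby:;shutdown=true")
    } catch {
      case e: SQLException if e.getSQLState == "XJ015" =>
        // expected on a successful system-wide shutdown
    }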

Re: Is there a way to shutdown the derby in hive context in spark shell?

2015-07-08 Thread Terry Hole
I am using spark 1.4.1rc1 with default hive settings Thanks - Terry Hi All, I'd like to use the hive context in spark shell, i need to recreate the hive meta database in the same location, so i want to close the derby connection previous created in the spark shell, is there any way to do

Re: Meets class not found error in spark console with newly hive context

2015-07-02 Thread Terry Hole
Found it, a bug in Spark 1.4.0: SPARK-8368 <https://issues.apache.org/jira/browse/SPARK-8368> Thanks! Terry On Thu, Jul 2, 2015 at 1:20 PM, Terry Hole hujie.ea...@gmail.com wrote: All, I am using spark console 1.4.0 to do some tests; when I create a new HiveContext (Line 18 in the code

Meets class not found error in spark console with newly hive context

2015-07-01 Thread Terry Hole
###"); ssc.stop(false, true); duration = 0; isRun = false } } }
33   ssc.awaitTermination()
34   println("Streaming context terminated.")
35 }
36
37 streamingTest(null)
38
Thanks Terry

Re: Is it possible to set the akka specify properties (akka.extensions) in spark

2015-05-11 Thread Terry Hole
,Whatever), underneath I think Spark won't ship properties which don't start with spark.* to the executors. Thanks Best Regards On Mon, May 11, 2015 at 8:33 AM, Terry Hole hujie.ea...@gmail.com wrote: Hi all, I'd like to monitor Akka using Kamon, which needs to set the akka.extension

Is it possible to set the akka specify properties (akka.extensions) in spark

2015-05-10 Thread Terry Hole
Hi all, I'd like to monitor Akka using Kamon, which needs the akka.extensions setting to be a list like this, in Typesafe Config format: akka { extensions = [kamon.system.SystemMetrics, kamon.statsd.StatsD] } But I cannot find a way to do this. I have tried these: 1.

Is it possible to set the akka specify properties (akka.extensions) in spark

2015-05-07 Thread Terry Hole
Hi all, I'd like to monitor Akka using Kamon, which needs the akka.extensions setting to be a list like this, in Typesafe Config format: akka { extensions = [kamon.system.SystemMetrics, kamon.statsd.StatsD] } But I cannot find a way to do this. I have tried these: 1.

Re: spark 1.3.0 strange log message

2015-04-23 Thread Terry Hole
Use this in spark conf: spark.ui.showConsoleProgress=false Best Regards, On Fri, Apr 24, 2015 at 11:23 AM, Henry Hung ythu...@winbond.com wrote: Dear All, When using spark 1.3.0 spark-submit with directing out and err to a log file, I saw some strange lines inside that looks like this:
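
Sketches of the two usual places to set it:

    // In code, before the SparkContext is created:
    import org.apache.spark.SparkConf
    val conf = new SparkConf().set("spark.ui.showConsoleProgress", "false")

    // Or at submit time:
    //   spark-submit --conf spark.ui.showConsoleProgress=false ...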

Fwd: [Spark Streaming] The FileInputDStream newFilesOnly=false does not work in 1.2 since

2015-01-20 Thread Terry Hole
- Terry

Re: Unable to use HiveContext in spark-shell

2014-11-06 Thread Terry Siu
What version of Spark are you using? Did you compile your Spark version and if so, what compile options did you use? On 11/6/14, 9:22 AM, tridib tridib.sama...@live.com wrote: Help please!

Re: Unable to use HiveContext in spark-shell

2014-11-06 Thread Terry Siu
? From: Tridib Samanta tridib.sama...@live.com Date: Thursday, November 6, 2014 at 9:49 AM To: Terry Siu terry@smartfocus.com, u...@spark.incubator.apache.org

NoClassDefFoundError encountered in Spark 1.2-snapshot build with hive-0.13.1 profile

2014-11-03 Thread Terry Siu
above, it completed without any errors. I’m wondering what sort of benefit there is to include the -Phive-0.13.1 profile into the build as it looks like there’s some shaded jar action going on. Thanks, -Terry

ParquetFilters and StringType support for GT, GTE, LT, LTE

2014-11-03 Thread Terry Siu
this morning and now the same query will give me a MatchError for this column of string type. Thanks, -Terry

Re: NoClassDefFoundError encountered in Spark 1.2-snapshot build with hive-0.13.1 profile

2014-11-03 Thread Terry Siu
Thanks, Kousuke. I’ll wait till this pull request makes it into the master branch. -Terry From: Kousuke Saruta saru...@oss.nttdata.co.jp Date: Monday, November 3, 2014 at 11:11 AM To: Terry Siu terry@smartfocus.com, user

Re: ParquetFilters and StringType support for GT, GTE, LT, LTE

2014-11-03 Thread Terry Siu
Done. https://issues.apache.org/jira/browse/SPARK-4213 Thanks, -Terry From: Michael Armbrust mich...@databricks.com Date: Monday, November 3, 2014 at 1:37 PM To: Terry Siu terry@smartfocus.com Cc: user@spark.apache.org

Spark Build

2014-10-31 Thread Terry Siu
not find MemLimitLogger anywhere in the Spark code. Has anybody else seen/encountered this? Thanks, -Terry

Re: Spark Build

2014-10-31 Thread Terry Siu
Thanks for the update, Shivaram. -Terry On 10/31/14, 12:37 PM, Shivaram Venkataraman shiva...@eecs.berkeley.edu wrote: Yeah looks like https://github.com/apache/spark/pull/2744 broke the build. We will fix it soon On Fri, Oct 31, 2014 at 12:21 PM, Terry Siu terry@smartfocus.com wrote: I

Re: Ambiguous references to id : what does it mean ?

2014-10-30 Thread Terry Siu
an Unresolved attributes error back. Is there any way around this short of renaming the columns in the join sources? Thanks -Terry Michael Armbrust wrote: Yes, but if both tagCollection and selectedVideos have a column named id then Spark SQL does not know which one you are referring to in the where

Re: SparkSQL - TreeNodeException for unresolved attributes

2014-10-21 Thread Terry Siu
Just to follow up, the queries worked against master and I got my whole flow rolling. Thanks for the suggestion! Now if only Spark 1.2 would come out with the next release of CDH5 :P -Terry From: Terry Siu terry@smartfocus.com Date: Monday, October 20, 2014

Re: SparkSQL IndexOutOfBoundsException when reading from Parquet

2014-10-20 Thread Terry Siu
Hi Yin, Sorry for the delay; I’ll try the code change when I get a chance, but Michael’s initial response did solve my problem. In the meantime, I’m hitting another issue with SparkSQL, which I will probably post in another message if I can’t figure out a workaround. Thanks, -Terry From: Yin

SparkSQL - TreeNodeException for unresolved attributes

2014-10-20 Thread Terry Siu
source tables. Help? Thanks, -Terry

Re: SparkSQL - TreeNodeException for unresolved attributes

2014-10-20 Thread Terry Siu
Hi Michael, Thanks again for the reply. Was hoping it was something I was doing wrong in 1.1.0, but I’ll try master. Thanks, -Terry From: Michael Armbrust mich...@databricks.com Date: Monday, October 20, 2014 at 12:11 PM To: Terry Siu terry

Re: SparkSQL IndexOutOfBoundsException when reading from Parquet

2014-10-15 Thread Terry Siu
. Let me know if you need more information. Thanks -Terry From: Yin Huai huaiyin@gmail.com Date: Tuesday, October 14, 2014 at 6:29 PM To: Terry Siu terry@smartfocus.com Cc: Michael Armbrust mich...@databricks.com

Re: SparkSQL IndexOutOfBoundsException when reading from Parquet

2014-10-14 Thread Terry Siu
Hi Michael, That worked for me. At least I’m now further than I was. Thanks for the tip! -Terry From: Michael Armbrust mich...@databricks.com Date: Monday, October 13, 2014 at 5:05 PM To: Terry Siu terry@smartfocus.com Cc: user

SparkSQL IndexOutOfBoundsException when reading from Parquet

2014-10-13 Thread Terry Siu
defined. Does this error look familiar to anyone? Could my usage of SparkSQL with Hive be incorrect or is support with Hive/Parquet/partitioning still buggy at this point in Spark 1.1.0? Thanks, -Terry