Yes, it contains one line.
On Wed, Aug 26, 2015 at 8:20 PM, Yin Huai [via Apache Spark Developers
List] wrote:
The JSON support in Spark SQL handles a file with one JSON object per line
or one JSON array of objects per line. What is the format of your file?
We build on Jenkins with 3.1.1, but also have 3.0.4.
On Wed, Aug 26, 2015 at 8:18 AM, Sean Owen so...@cloudera.com wrote:
It sounds like you're doing the right things. I believe the Jenkins
test machines also have 3.0.4, but successfully build by using
build/mvn --force. Not sure what to make of
The JSON support in Spark SQL handles a file with one JSON object per line
or one JSON array of objects per line. What is the format of your file? Does
it only contain a single line?
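The one-object-per-line layout described above can be sketched with plain Python (file name and records are hypothetical; only the stdlib json module is used):

```python
import json

# Records written one JSON object per line -- the layout Spark SQL's
# json reader expects, since it splits the input file by line.
records = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]

with open("people.json", "w") as f:
    for rec in records:
        f.write(json.dumps(rec) + "\n")  # one self-contained object per line

# Each line parses independently, with no surrounding array or commas.
with open("people.json") as f:
    parsed = [json.loads(line) for line in f]

print(parsed)
```

A pretty-printed document, or a single array spanning several lines, would not match this layout and cannot be split this way.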
On Wed, Aug 26, 2015 at 6:47 AM, gsvic victora...@gmail.com wrote:
Hi,
I have the following issue. I am trying to
Any reason why you have more than 2G in a single line?
There is a limit of 2G in the Hadoop library we use. Also the JVM doesn't
work when your string is that long.
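A quick way to check whether a file is at risk of the per-line limit mentioned above is to measure its longest line. This is a stdlib-only sketch (the file name is hypothetical; the 2 GB figure is the limit quoted in the message):

```python
# Return the longest line of a file in bytes. Lines approaching 2 GB
# will break the Hadoop record reader and JVM String handling.
def max_line_bytes(path):
    longest = 0
    with open(path, "rb") as f:
        for line in f:  # streams line by line, constant memory
            longest = max(longest, len(line))
    return longest

# Tiny demonstration file with two short lines.
with open("sample.json", "wb") as f:
    f.write(b'{"a": 1}\n{"b": 22}\n')

print(max_line_bytes("sample.json"))  # 10
```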
On Wed, Aug 26, 2015 at 11:38 AM, gsvic victora...@gmail.com wrote:
Yes, it contains one line.
On Wed, Aug 26, 2015 at 8:20 PM,
rxin wrote:
The release files, including signatures, digests, etc. can be found at:
http://people.apache.org/~pwendell/spark-releases/spark-1.5.0-rc2-bin/
Release artifacts are signed with the following key:
https://people.apache.org/keys/committer/pwendell.asc
I was looking
- tested the backpressure/rate controlling in streaming. It works as
expected.
- there is a problem with the Scala 2.11 sbt build:
https://issues.apache.org/jira/browse/SPARK-10227
Luc Bourlier
Spark Team - Typesafe, Inc.
luc.bourl...@typesafe.com
http://www.typesafe.com
On
My understanding is that people on this mailing list who are interested in
helping can log comments on the GORA JIRA.
HBase integration with Spark is proven to work, so the intricacies should
be on the Gora side.
On Wed, Aug 26, 2015 at 8:08 AM, Furkan KAMACI furkankam...@gmail.com
wrote:
Btw, here is
My quick take: no blockers at this point, except for one potential
issue. Still some 'critical' bugs worth a look. The release seems to
pass tests, but I get a lot of spurious failures; it took about 16
hours of running tests to get everything to pass at least once.
Current score: 56 issues
+1, tested that 1.5.0-RC2 works with Tachyon 0.7.1 as external block store.
One small update -- the vote should close Saturday, Aug 29, not Friday, Aug
29.
On Tue, Aug 25, 2015 at 9:28 PM, Reynold Xin r...@databricks.com wrote:
Please vote on releasing the following candidate as Apache Spark version
1.5.0. The vote is open until Friday, Aug 29, 2015 at 5:00 UTC and
The Scala 2.11 issue should be fixed, but doesn't need to be a blocker,
since Maven builds fine. The sbt build is more aggressive to make sure we
catch warnings.
On Wed, Aug 26, 2015 at 10:01 AM, Sean Owen so...@cloudera.com wrote:
My quick take: no blockers at this point, except for one
I ran into the same error (different dependency) earlier today. In my
case, the Maven POM files and the sbt dependencies had a conflict
(different versions of the same artifact) and ivy got confused. Not
sure whether that will help in your case or not...
On Wed, Aug 26, 2015 at 2:23 PM, Holden
Hi Ted,
You can check the full stack trace in the attachment on the Jira issue:
https://issues.apache.org/jira/browse/GORA-386
Kind Regards,
Furkan KAMACI
On Wed, Aug 26, 2015 at 6:55 PM, Ted Yu yuzhih...@gmail.com wrote:
My understanding is that people on this mailing list who are interested to
Has anyone else run into "impossible to get artifacts when data has not
been loaded. IvyNode = org.scala-lang#scala-library;2.10.3" during
hive/update when building with sbt? Working around it is pretty simple
(just add it as a dependency), but I'm wondering if it's impacting anyone
else and I should
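The workaround mentioned (adding the artifact as an explicit dependency) would look roughly like this in build.sbt; this is a sketch, not the project's actual change:

```scala
// Hypothetical build.sbt workaround: declare the artifact ivy cannot
// resolve as an explicit dependency so the version is pinned.
libraryDependencies += "org.scala-lang" % "scala-library" % "2.10.3"
```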
This looks promising. I'm trying to use spark-ec2 to launch a cluster with
Spark 1.5.0-SNAPSHOT and failing.
Where should we ask questions, report problems?
A couple of questions I have already after looking through the project:
- Where does the configuration file spark-deployer.conf go
Hi,
I start an HBase cluster for my test class. I use that helper class:
https://github.com/apache/gora/blob/master/gora-hbase/src/test/java/org/apache/gora/hbase/util/HBaseClusterSingleton.java
and use it as like that:
private static final HBaseClusterSingleton cluster =
So, I actually tried this, and it built without problems, but publishing the
artifacts to artifactory ended up with some strangeness in the child poms,
where the property wasn’t resolved. This leads to issues pulling them into
other projects of: “Could not find
The connection failure was to zookeeper.
Have you verified that localhost:2181 can serve requests?
What version of HBase was Gora built against?
Cheers
On Aug 26, 2015, at 1:50 AM, Furkan KAMACI furkankam...@gmail.com wrote:
Hi,
I start an HBase cluster for my test class. I use that
Can you log the contents of the Configuration you pass from Spark?
The output would give you some clue.
Cheers
On Aug 26, 2015, at 2:30 AM, Furkan KAMACI furkankam...@gmail.com wrote:
Hi Ted,
I'll check the Zookeeper connection, but another test method which runs on
HBase
without Spark
I've always used HBaseTestingUtility and never really had much trouble. I
use that for all my unit testing between Spark and HBase.
Here are some code examples if you're interested:
--Main HBase-Spark Module
https://github.com/apache/hbase/tree/master/hbase-spark
--Unit tests that cover all basic
Hi,
Here is the test method I've ignored due to the Connection Refused
failure:
https://github.com/kamaci/gora/blob/master/gora-hbase/src/test/java/org/apache/gora/hbase/mapreduce/TestHBaseStoreWordCount.java#L65
I've implemented a Spark backend for Apache Gora as GSoC project and this
is
Where is the input format class? Whenever I use the search on your
GitHub it says "We couldn't find any issues matching 'GoraInputFormat'"
On Wed, Aug 26, 2015 at 9:48 AM, Furkan KAMACI furkankam...@gmail.com
wrote:
Hi,
Here is the MapReduceTestUtils.testSparkWordCount()
Please ask questions at the gitter channel for now.
https://gitter.im/pishen/spark-deployer
- spark-deployer.conf should be placed in your project's root directory
(beside build.sbt)
- To use the nightly builds, you can replace the value of spark-tgz-url
in spark-deployer.conf with the tgz you want
Hi Ted,
I'll check the Zookeeper connection, but another test method which runs on
HBase without Spark works without any error. The HBase version is
0.98.8-hadoop2 and I use Spark 1.3.1.
Kind Regards,
Furkan KAMACI
On 26 Aug 2015 at 12:08, Ted Yu yuzhih...@gmail.com wrote:
The connection failure was
Where can I find the code for MapReduceTestUtils.testSparkWordCount?
On Wed, Aug 26, 2015 at 9:29 AM, Furkan KAMACI furkankam...@gmail.com
wrote:
Hi,
Here is the test method I've ignored due to the Connection Refused
failure:
Hi,
I have the following issue. I am trying to load a 2.5G JSON file from a
10-node Hadoop cluster. Actually, I am trying to create a DataFrame, using
sqlContext.read.json("hdfs://master:9000/path/file.json").
The JSON file contains a parsed table (relation) from the TPC-H benchmark.
After
Hi,
Here is the MapReduceTestUtils.testSparkWordCount()
https://github.com/kamaci/gora/blob/master/gora-core/src/test/java/org/apache/gora/mapreduce/MapReduceTestUtils.java#L108
Here is SparkWordCount
I found GORA-386 Gora Spark Backend Support
Should the discussion be continued there ?
Cheers
On Wed, Aug 26, 2015 at 7:02 AM, Ted Malaska ted.mala...@cloudera.com
wrote:
Where is the input format class? Whenever I use the search on your
GitHub it says We couldn't find any issues matching
Btw, here is the source code of GoraInputFormat.java :
https://github.com/kamaci/gora/blob/master/gora-core/src/main/java/org/apache/gora/mapreduce/GoraInputFormat.java
On 26 Aug 2015 at 18:05, Furkan KAMACI furkankam...@gmail.com wrote:
I'll send an e-mail to Gora dev list too and also
Currently trying to compile 1.5-RC2 (from
https://github.com/apache/spark/commit/727771352855dbb780008c449a877f5aaa5fc27a)
and running into issues with the new Maven requirement. I have 3.0.4 installed
at the system level, 1.5 requires 3.3.3. As Patrick has pointed out in other
places, this
It sounds like you're doing the right things. I believe the Jenkins
test machines also have 3.0.4, but successfully build by using
build/mvn --force. Not sure what to make of that.
On Wed, Aug 26, 2015 at 4:08 PM, Chris Freeman cfree...@alteryx.com wrote:
Currently trying to compile 1.5-RC2
No, I created the file by appending each JSON record in a loop without
starting a new line. I've just changed that and now it works fine. Thank you
very much for your support.
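The fix described here (one record per line instead of one giant line) can be sketched as follows. The file names are hypothetical, and for a real 2.5G file you would stream the array with an incremental parser rather than json.load:

```python
import json

# A file written as one big JSON array on a single line -- the layout
# that caused the failure in this thread.
with open("single_line.json", "w") as f:
    json.dump([{"id": i} for i in range(3)], f)

# Rewrite it as newline-delimited JSON, one object per line, which
# Spark SQL's json reader can split and parse.
with open("single_line.json") as src, open("lines.json", "w") as dst:
    for rec in json.load(src):
        dst.write(json.dumps(rec) + "\n")

with open("lines.json") as f:
    lines = f.read().splitlines()
print(len(lines))  # 3
```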
I've noticed that two queries, which return identical results, have very
different performance. I'd be interested in any hints about how to avoid
problems like this.
The DataFrame df contains a string field series and an integer eday, the
number of days since (or before) the 1970-01-01 epoch.
I'm
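For reference, the eday encoding described here (days since, or before, the 1970-01-01 epoch) maps to calendar dates like this, using only the Python standard library:

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)

def eday_to_date(eday):
    """Convert a days-since-epoch integer (negative means before 1970)."""
    return EPOCH + timedelta(days=eday)

print(eday_to_date(0))      # 1970-01-01
print(eday_to_date(-1))     # 1969-12-31
print(eday_to_date(16672))  # 2015-08-25
```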
I ran into a similar problem while working on the spark-redshift library
and was able to fix it by bumping that library's ScalaTest version. I'm
still fighting some mysterious Scala issues while trying to test the
spark-csv library against 1.5.0-RC1, so it's possible that a build or
dependency
Hi,
We released a package called LLQL, which is a serialization of relational
algebra operators. Spark SQL Plan is the first one supported.
More interesting to the Spark community, probably, is our test that
implements TPC-H. We manually rewrote some SQL -- mainly pulling subqueries
out and