Does the machine have a cron job that periodically cleans up the /tmp dir ?
Cheers
On Thu, Mar 12, 2015 at 6:18 PM, sequoiadb mailing-list-r...@sequoiadb.com
wrote:
Checking the script, it seems spark-daemon.sh is unable to stop the worker:
$ ./spark-daemon.sh stop org.apache.spark.deploy.worker.Worker
Can you try giving Spark driver more heap ?
Cheers
On Mar 25, 2015, at 2:14 AM, Todd Leo sliznmail...@gmail.com wrote:
Hi,
I am using Spark SQL to query on my Hive cluster, following Spark SQL and
DataFrame Guide step by step. However, my HiveQL via sqlContext.sql() fails
and
For Gradle, there are:
https://github.com/musketyr/gradle-fatjar-plugin
https://github.com/johnrengelman/shadow
FYI
On Sun, Mar 29, 2015 at 4:29 PM, jay vyas jayunit100.apa...@gmail.com
wrote:
thanks for posting this! I've run into similar issues before, and generally
it's a bad idea to swap
Nathan:
Please look in log files for any of the following:
doCleanupRDD():
case e: Exception => logError("Error cleaning RDD " + rddId, e)
doCleanupShuffle():
case e: Exception => logError("Error cleaning shuffle " + shuffleId, e)
doCleanupBroadcast():
case e: Exception =>
Jenkins build failed too:
https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.3-Maven-with-YARN/HADOOP_PROFILE=hadoop-2.4,label=centos/326/consoleFull
For the moment, you can apply the following change:
diff --git
bq. In /etc/security/limits.conf set the next values:
Have you done the above modification on all the machines in your Spark
cluster ?
If you use Ubuntu, be sure that the /etc/pam.d/common-session file contains
the following line:
session required pam_limits.so
On Mon, Mar 30, 2015 at 5:08
Nicolas:
See if there was occurrence of the following exception in the log:
errs => throw new SparkException(
  s"Couldn't connect to leader for topic ${part.topic} ${part.partition}: " +
    errs.mkString("\n")),
Cheers
On Mon, Mar 30, 2015 at 9:40 AM, Cody Koeninger
Which Spark and Hive release are you using ?
Thanks
On Mar 27, 2015, at 2:45 AM, Masf masfwo...@gmail.com wrote:
Hi.
In HiveContext, when I put this statement: DROP TABLE IF EXISTS TestTable
If TestTable doesn't exist, Spark returns an error:
ERROR Hive:
You can use broadcast variable.
See also this thread:
http://search-hadoop.com/m/JW1q5GX7U22/Spark+broadcast+variablesubj=How+Broadcast+variable+scale+
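A minimal sketch of the pattern (the lookup map and its contents are illustrative, assuming an RDD of string keys):

// broadcast a read-only lookup table once instead of shipping it with every task closure
val lookup = Map("a" -> 1, "b" -> 2)
val bcLookup = sc.broadcast(lookup)
val resolved = rdd1.map(key => bcLookup.value.getOrElse(key, -1))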
On Mar 31, 2015, at 4:43 AM, Peng Xia sparkpeng...@gmail.com wrote:
Hi,
I have an RDD (rdd1) where each line is split into an array [a,
10 --topic toto
kafka-topics --create --zookeeper localhost:2181 --replication-factor 1
--partitions 1 --topic toto-single
I'm launching my Spark Streaming in local mode.
@Ted Yu There's no "Couldn't connect to leader for topic" log; here's the
full version :
spark-submit --conf
+1 on escaping column names.
On Apr 1, 2015, at 5:50 AM, fergjo00 johngfergu...@gmail.com wrote:
Question:
---
Is there a way to have JDBC DataFrames use quoted/escaped column names?
Right now, it looks like it sees the names correctly in the schema created
but does not
Please invoke dev/change-version-to-2.11.sh before running mvn.
Cheers
On Mon, Mar 30, 2015 at 1:02 AM, Night Wolf nightwolf...@gmail.com wrote:
Hey,
Trying to build Spark 1.3 with Scala 2.11 supporting yarn hive (with
thrift server).
Running:
mvn -e -DskipTests -Pscala-2.11
jamborta :
Please also describe the format of your csv files.
Cheers
On Fri, Mar 27, 2015 at 6:42 AM, DW @ Gmail deanwamp...@gmail.com wrote:
Show us the code. This shouldn't happen for the simple process you
described
Sent from my rotary phone.
On Mar 27, 2015, at 5:47 AM, jamborta
Jeetendra:
Please extract the information you need from Result and return the
extracted portion - instead of returning Result itself.
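As a sketch (hbaseRdd, the column family and qualifier are illustrative), assuming the usual (ImmutableBytesWritable, Result) pairs:

import org.apache.hadoop.hbase.util.Bytes

// illustrative: keep only the bytes we need, so the RDD no longer carries Result objects
val values = hbaseRdd.map { case (_, result) =>
  Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("qual")))
}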
Cheers
On Tue, Mar 31, 2015 at 1:14 PM, Nan Zhu zhunanmcg...@gmail.com wrote:
The example in
Please take a look at
https://spark.apache.org/docs/latest/sql-programming-guide.html
Cheers
On Mar 28, 2015, at 5:08 AM, Vincent He vincent.he.andr...@gmail.com wrote:
I am learning Spark SQL and trying the spark-sql example. I am running the
following code, but I got an exception ERROR CliDriver:
sample with Scala
or Python, but for the spark-sql shell, I cannot get an example running
successfully. Can you give me an example I can run with ./bin/spark-sql
without writing any code? thanks
On Sat, Mar 28, 2015 at 7:35 AM, Ted Yu yuzhih...@gmail.com wrote:
Please take a look at
https
there would be an exclusion in the pom to deal with this.
Dale.
From: Zhan Zhang zzh...@hortonworks.com
Date: Friday, March 27, 2015 at 4:28 PM
To: Johnson, Dale daljohn...@ebay.com
Cc: Ted Yu yuzhih...@gmail.com, user user@spark.apache.org
Subject: Re: Can't access file in spark, but can
Looking at SparkSubmit#addJarToClasspath():
uri.getScheme match {
  case "file" | "local" =>
  ...
  case _ =>
    printWarning(s"Skip remote jar $uri.")
It seems the hdfs scheme is not recognized.
FYI
On Thu, Feb 26, 2015 at 6:09 PM, dilm dmend...@exist.com wrote:
I'm trying to run a
Have you tried adding the following ?
import org.apache.spark.sql.SQLContext
Cheers
On Mon, Mar 23, 2015 at 6:45 AM, IT CTO goi@gmail.com wrote:
Thanks.
I am new to the environment and running Cloudera CDH 5.3 with Spark in it.
Apparently, when running this command in spark-shell: val
I thought of formation #1.
But it looks like when there are many fields, formation #2 is cleaner.
Cheers
On Sun, Mar 22, 2015 at 8:14 PM, Cheng Lian lian.cs@gmail.com wrote:
You need either
.map { row =>
  (row(0).asInstanceOf[Float], row(1).asInstanceOf[Float], ...)
}
or
.map { case
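The quoted alternative is cut off above; a sketch of how the pattern-match form typically looks, assuming three Float columns:

import org.apache.spark.sql.Row
// assumed completion of the truncated .map { case ... } alternative
rdd.map { case Row(a: Float, b: Float, c: Float) => (a, b, c) }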
In this thread:
http://search-hadoop.com/m/JW1q5DM69G
I only saw two replies. Maybe some people forgot to use 'Reply to All' ?
Cheers
On Mon, Mar 23, 2015 at 8:19 AM, mrm ma...@skimlinks.com wrote:
Hi,
I have received three replies to my question on my personal e-mail; why
don't they also
bq. Cause was: akka.remote.InvalidAssociation: Invalid address:
akka.tcp://sparkMaster@localhost:7077
There should be some more output following the above line.
Can you post them ?
Cheers
On Mon, Mar 2, 2015 at 2:06 PM, Krishnanand Khambadkone
kkhambadk...@yahoo.com.invalid wrote:
Hi, I am
Here is snippet of dependency tree for spark-hive module:
[INFO] org.apache.spark:spark-hive_2.10:jar:1.3.0-SNAPSHOT
...
[INFO] +- org.spark-project.hive:hive-metastore:jar:0.13.1a:compile
[INFO] | +- org.spark-project.hive:hive-shims:jar:0.13.1a:compile
[INFO] | | +-
Default RM Web UI port is 8088 (configurable
through yarn.resourcemanager.webapp.address)
Cheers
On Mon, Mar 2, 2015 at 4:14 PM, Anupama Joshi anupama.jo...@gmail.com
wrote:
Hi Marcelo,
Thanks for the quick reply.
I have a EMR cluster and I am running the spark-submit on the master node
in
-20150302155433- is FAILED
On Monday, March 2, 2015 2:42 PM, Ted Yu yuzhih...@gmail.com wrote:
bq. Cause was: akka.remote.InvalidAssociation: Invalid address:
akka.tcp://sparkMaster@localhost:7077
There should be some more output following the above line.
Can you post them ?
Cheers
bq. spark UI does not work for Yarn-cluster.
Can you be a bit more specific on the error(s) you saw ?
What Spark release are you using ?
Cheers
On Tue, Mar 3, 2015 at 8:53 AM, Rohini joshi roni.epi...@gmail.com wrote:
Sorry for the half email - here it is again in full
Hi ,
I have 2
dataset we were using locally. It took me a couple of days and digging
through many logs to figure out that this value was what was causing the
problem.
On Sat, Feb 28, 2015 at 11:38 AM, Ted Yu yuzhih...@gmail.com wrote:
Having good out-of-box experience is desirable.
+1 on increasing the default
, Ted Yu yuzhih...@gmail.com wrote:
I have created SPARK-6085 with pull request:
https://github.com/apache/spark/pull/4836
Cheers
On Sat, Feb 28, 2015 at 12:08 PM, Corey Nolet cjno...@gmail.com wrote:
+1 to a better default as well.
We were working fine until we ran against a real dataset
to the external one, but
it still does not work.
Thanks
_roni
On Tue, Mar 3, 2015 at 9:05 AM, Ted Yu yuzhih...@gmail.com wrote:
bq. spark UI does not work for Yarn-cluster.
Can you be a bit more specific on the error(s) you saw ?
What Spark release are you using ?
Cheers
On Tue, Mar 3, 2015 at 8:53
If you can use hadoop 2.6.0 binary, you can use s3a
s3a is being polished in the upcoming 2.7.0 release:
https://issues.apache.org/jira/browse/HADOOP-11571
Cheers
On Tue, Mar 3, 2015 at 9:44 AM, Ankur Srivastava ankur.srivast...@gmail.com
wrote:
Hi,
We recently upgraded to Spark 1.2.1 -
Please take a look at DoubleRDDFunctions.scala :
/** Compute the mean of this RDD's elements. */
def mean(): Double = stats().mean
/** Compute the variance of this RDD's elements. */
def variance(): Double = stats().variance
/** Compute the standard deviation of this RDD's elements.
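All three delegate to one stats() pass; a hedged usage sketch:

// one pass over the data computes the whole summary
val nums = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
val st = nums.stats() // a StatCounter
println(s"mean=${st.mean} variance=${st.variance} stdev=${st.stdev}")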
What release are you using ?
SPARK-3923 went into 1.2.0 release.
Cheers
On Wed, Mar 4, 2015 at 1:39 PM, Thomas Gerber thomas.ger...@radius.com
wrote:
Hello,
sometimes, in the *middle* of a job, the job stops (status is then seen
as FINISHED in the master).
There isn't anything wrong in
Please follow SPARK-5654
On Wed, Mar 4, 2015 at 7:22 PM, Haopu Wang hw...@qilinsoft.com wrote:
Thanks, it's an active project.
Will it be released with Spark 1.3.0?
--
*From:* 鹰 [mailto:980548...@qq.com]
*Sent:* Thursday, March 05, 2015 11:19 AM
*To:*
Please add the following to build command:
-Djackson.version=1.9.3
Cheers
On Thu, Mar 5, 2015 at 10:04 AM, Todd Nist tsind...@gmail.com wrote:
I am running Spark on a HortonWorks HDP Cluster. I have deployed the
prebuilt version but it is only for Spark 1.2.0, not 1.2.1, and there are a
few
to read multiple HDFS
files into an RDD. What I am doing now is: I read each file into its own RDD,
then later I union all these RDDs into one RDD. I am not sure if that
is the best way to do it.
Thanks
Senqiang
On Tuesday, March 3, 2015 2:40 PM, Ted Yu yuzhih...@gmail.com wrote
:00 Ted Yu yuzhih...@gmail.com:
Looking at FileInputFormat#listStatus():
// Whether we need to recursive look into the directory structure
boolean recursive = job.getBoolean(INPUT_DIR_RECURSIVE, false);
where:
public static final String INPUT_DIR_RECURSIVE
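The constant above is truncated; a sketch of enabling recursive listing for the new-API path, assuming the standard mapreduce key behind INPUT_DIR_RECURSIVE:

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// assumption: this is the config key behind INPUT_DIR_RECURSIVE
sc.hadoopConfiguration.set(
  "mapreduce.input.fileinputformat.input.dir.recursive", "true")
val lines = sc.newAPIHadoopFile[LongWritable, Text, TextInputFormat](
  "hdfs:///data/parent-dir").map(_._2.toString)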
Looking at scaladoc:
/** Get an RDD for a Hadoop file with an arbitrary new API InputFormat. */
def newAPIHadoopFile[K, V, F <: NewInputFormat[K, V]]
Your conclusion is confirmed.
On Tue, Mar 3, 2015 at 1:59 PM, S. Zhou myx...@yahoo.com.invalid wrote:
I did some experiments and it seems
Having good out-of-box experience is desirable.
+1 on increasing the default.
On Sat, Feb 28, 2015 at 8:27 AM, Sean Owen so...@cloudera.com wrote:
There was a recent discussion about whether to increase or indeed make
configurable this kind of default fraction. I believe the suggestion
Have you verified that spark-catalyst_2.10 jar was in the classpath ?
Cheers
On Sat, Feb 28, 2015 at 9:18 AM, Ashish Nigam ashnigamt...@gmail.com
wrote:
Hi,
I wrote a very simple program in scala to convert an existing RDD to
SchemaRDD.
But the createSchemaRDD function is throwing an exception
Here was the latest modification in the spork repo:
Mon Dec 1 10:08:19 2014
Not sure if it is being actively maintained.
On Sat, Feb 28, 2015 at 6:26 PM, Qiang Cao caoqiang...@gmail.com wrote:
Thanks for the pointer, Ashish! I was also looking at Spork
https://github.com/sigmoidanalytics/spork
OpenJDK Runtime Environment (rhel-2.5.4.0.el6_6-x86_64 u75-b13)
OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)
Thanks
*From:* Ted Yu [mailto:yuzhih...@gmail.com]
*Sent:* Sunday, March 01, 2015 10:18 PM
*To:* Zalzberg, Idan (Agoda)
*Cc:* user@spark.apache.org
*Subject:* Re: unsafe memory access
What Java version are you using ?
Thanks
On Sun, Mar 1, 2015 at 7:03 AM, Zalzberg, Idan (Agoda)
idan.zalzb...@agoda.com wrote:
Hi,
I am using Spark 1.2.1, and I get these errors sporadically:
Any thought on what could be the cause?
Thanks
2015-02-27 15:08:47 ERROR
Cui:
You can check messages.partitions.size to determine whether messages is an
empty RDD.
Cheers
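A sketch of that check per micro-batch (assuming messages is the DStream from createStream):

messages.foreachRDD { rdd =>
  // the check suggested above: an empty batch has no partitions
  if (rdd.partitions.size == 0) {
    // skip empty batches
  } else {
    rdd.foreach(println)
  }
}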
On Thu, Mar 5, 2015 at 12:52 AM, Akhil Das ak...@sigmoidanalytics.com
wrote:
When you use KafkaUtils.createStream with StringDecoders, it will return
String objects inside your messages stream.
Installing hbase on hadoop cluster would allow hbase to utilize features
provided by hdfs, such as short circuit read (See '90.2. Leveraging local
data' under http://hbase.apache.org/book.html#perf.hdfs).
Cheers
On Sun, Feb 22, 2015 at 11:38 PM, Akhil Das ak...@sigmoidanalytics.com
wrote:
If
. The
question is really, how to get the classloader visibility right. It
depends on where you need these classes. Have you looked into
spark.files.userClassPathFirst and spark.yarn.user.classpath.first ?
On Wed, Feb 25, 2015 at 5:34 AM, Ted Yu yuzhih...@gmail.com wrote:
bq. depend on missing
Haven't found the method in
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.SchemaRDD
The new DataFrame has this method:
/**
* Returns the content of the [[DataFrame]] as an [[RDD]] of [[Row]]s.
* @group rdd
*/
def rdd: RDD[Row] = {
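So after the SchemaRDD-to-DataFrame migration, a minimal sketch (the table name is illustrative):

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.Row

// a DataFrame exposes its rows through .rdd
val rows: RDD[Row] = sqlContext.sql("SELECT * FROM some_table").rdd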
FYI
On Sun, Feb
bq. i didnt get any new subscription mail in my inbox.
Have you checked your Spam folder ?
Cheers
On Sun, Feb 22, 2015 at 2:36 PM, hnahak harihar1...@gmail.com wrote:
I'm also facing the same issue; this is the third time. Whenever I post
anything it is never accepted by the community, and at the same
bq. bash: git: command not found
Looks like the AMI doesn't have git pre-installed.
Cheers
On Sun, Feb 22, 2015 at 4:29 PM, olegshirokikh o...@solver.com wrote:
I'm trying to launch Spark cluster on AWS EC2 with custom AMI (Ubuntu)
using
the following:
./ec2/spark-ec2 --key-pair=***
bq. Caused by: java.lang.ClassNotFoundException: com.rick.reports.Reports$SensorReports
Is the Reports$SensorReports class in rick-processors-assembly-1.0.jar ?
Thanks
On Mon, Feb 23, 2015 at 8:43 PM, necro351 . necro...@gmail.com wrote:
Hello,
I am trying to deserialize some data encoded
bq. have installed hadoop on a local virtual machine
Can you tell us the release of hadoop you installed ?
What Spark release are you using ? Or to be more specific, what hadoop release
was the Spark built against ?
Cheers
On Mon, Feb 23, 2015 at 9:37 PM, fanooos dev.fano...@gmail.com wrote:
Hi
Can you pastebin the whole stack trace ?
Thanks
On Feb 23, 2015, at 6:14 PM, bit1...@163.com bit1...@163.com wrote:
Hi,
When I submit a spark streaming application with following script,
./spark-submit --name MyKafkaWordCount --master local[20] --executor-memory
512M
$Builder.class
10640 Mon Feb 23 17:34:46 PST 2015
com/defend7/reports/Reports$SensorReports.class
815 Mon Feb 23 17:34:46 PST 2015
com/defend7/reports/Reports$SensorReportsOrBuilder.class
On Mon Feb 23 2015 at 8:57:18 PM Ted Yu yuzhih...@gmail.com wrote:
bq. Caused
Here is a tool which may give you some clue:
http://file-leak-detector.kohsuke.org/
Cheers
On Tue, Feb 24, 2015 at 11:04 AM, Vladimir Rodionov
vrodio...@splicemachine.com wrote:
Usually it happens in Linux when application deletes file w/o double
checking that there are no open FDs (resource
Can you be a bit more specific ?
Are you asking about performance across Spark releases ?
Cheers
On Sat, Feb 21, 2015 at 6:38 AM, Deep Pradhan pradhandeep1...@gmail.com
wrote:
Hi,
Has some performance prediction work been done on Spark?
Thank You
Have you looked at
http://spark.apache.org/docs/1.2.0/api/scala/index.html#org.apache.spark.sql.SchemaRDD
?
Cheers
On Sat, Feb 21, 2015 at 4:24 AM, Nikhil Bafna nikhil.ba...@flipkart.com
wrote:
Hi.
My use case is building a realtime monitoring system over
multi-dimensional data.
The way
Could this be caused by Spark using shaded Guava jar ?
Cheers
On Wed, Feb 25, 2015 at 3:26 PM, Pat Ferrel p...@occamsmachete.com wrote:
Getting an error that confuses me. Running a largish app on a standalone
cluster on my laptop. The app uses a guava HashBiMap as a broadcast value.
With
Maybe drop the exclusion for parquet-provided profile ?
Cheers
On Wed, Feb 25, 2015 at 8:42 PM, Jim Kleckner j...@cloudphysics.com wrote:
Inline
On Wed, Feb 25, 2015 at 1:53 PM, Ted Yu yuzhih...@gmail.com wrote:
Interesting. Looking at SparkConf.scala :
val configs = Seq
bq. depend on missing fastutil classes like Long2LongOpenHashMap
Looks like Long2LongOpenHashMap should be added to the shaded jar.
Cheers
On Tue, Feb 24, 2015 at 7:36 PM, Jim Kleckner j...@cloudphysics.com wrote:
Spark includes the clearspring analytics package but intentionally excludes
bq. whether or not rdd1 is a cached rdd
RDD has getStorageLevel method which would return the RDD's current storage
level.
SparkContext has this method:
* Return information about what RDDs are cached, if they are in mem or
on disk, how much space
* they take, etc.
*/
@DeveloperApi
JettyUtils is marked with:
private[spark] object JettyUtils extends Logging {
FYI
On Fri, Mar 27, 2015 at 9:50 AM, kmader kevin.ma...@gmail.com wrote:
I have a very strange error in Spark 1.3 where at runtime in the
org.apache.spark.ui.JettyUtils object the method createServletHandler is
not
In examples/src/main/scala/org/apache/spark/examples/HBaseTest.scala,
TableInputFormat is used.
TableInputFormat accepts the parameter
public static final String SCAN = "hbase.mapreduce.scan";
where if specified, Scan object would be created from String form:
if (conf.get(SCAN) != null) {
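A hedged sketch of supplying that Scan (table and family names are illustrative, and it assumes TableMapReduceUtil.convertScanToString is accessible in your HBase version):

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Scan
import org.apache.hadoop.hbase.mapreduce.{TableInputFormat, TableMapReduceUtil}
import org.apache.hadoop.hbase.util.Bytes

// illustrative: serialize a Scan into the String form TableInputFormat reads
val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "my_table")
val scan = new Scan().addFamily(Bytes.toBytes("cf"))
conf.set(TableInputFormat.SCAN, TableMapReduceUtil.convertScanToString(scan))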
This is related:
https://issues.apache.org/jira/browse/SPARK-6340
On Thu, Mar 26, 2015 at 5:58 AM, sergunok ser...@gmail.com wrote:
Hi guys,
I don't have an exact picture of how ordering of RDD elements is preserved
after executing operations.
Which operations preserve it?
1) Map
Looks like the following assertion failed:
Preconditions.checkState(storageIDsCount == locs.size());
locs is a List<DatanodeInfoProto>
Can you enhance the assertion to log more information ?
Cheers
On Thu, Mar 26, 2015 at 3:06 PM, Dale Johnson daljohn...@ebay.com wrote:
There seems to be a
Looking at output from dependency:tree, servlet-api is brought in by the
following:
[INFO] +- org.apache.cassandra:cassandra-all:jar:1.2.6:compile
[INFO] | +- org.antlr:antlr:jar:3.2:compile
[INFO] | +- com.googlecode.json-simple:json-simple:jar:1.1:compile
[INFO] | +-
JAVA_HOME, an environment variable, should be defined on the node where
appattempt_1420225286501_4699_02 ran.
Cheers
On Thu, Mar 19, 2015 at 8:59 AM, Williams, Ken ken.willi...@windlogics.com
wrote:
I’m trying to upgrade a Spark project, written in Scala, from Spark
1.2.1 to 1.3.0, so I
18, 2015 at 9:53 AM, Ranga sra...@gmail.com wrote:
Thanks for the information. Will rebuild with 0.6.0 till the patch is
merged.
On Tue, Mar 17, 2015 at 7:24 PM, Ted Yu yuzhih...@gmail.com wrote:
Ranga:
Take a look at https://github.com/apache/spark/pull/4867
Cheers
On Tue, Mar 17, 2015
What's the expected number of partitions in your use case ?
Have you thought of doing batching in the workers ?
Cheers
On Sat, Mar 7, 2015 at 10:54 PM, A.K.M. Ashrafuzzaman
ashrafuzzaman...@gmail.com wrote:
While processing DStream in the Spark Programming Guide, the suggested
usage of
InputSplit is in hadoop-mapreduce-client-core jar
Please check that the jar is in your classpath.
Cheers
On Mon, Mar 23, 2015 at 8:10 AM, , Roy rp...@njit.edu wrote:
Hi,
I am using CDH 5.3.2 packages installation through Cloudera Manager 5.3.2
I am trying to run one spark job with
bq. is to modify compute_classpath.sh on all worker nodes to include your
driver JARs.
Please follow the above advice.
Cheers
On Mon, Mar 23, 2015 at 12:34 PM, Jack Arenas j...@ckarenas.com wrote:
Hi Team,
I’m trying to create a DF using jdbc as detailed here
It is logged from RecurringTimer#loop():
private def loop() {
try {
while (!stopped) {
clock.waitTillTime(nextTime)
callback(nextTime)
prevTime = nextTime
nextTime += period
logDebug("Callback for " + name + " called at time " + prevTime)
}
Looking at the source code for AbstractGenericUDAFResolver, the following
(non-deprecated) method should be called:
public GenericUDAFEvaluator getEvaluator(GenericUDAFParameterInfo info)
It is called by hiveUdfs.scala (master branch):
val parameterInfo = new
Looking at core/pom.xml :
<dependency>
  <groupId>org.json4s</groupId>
  <artifactId>json4s-jackson_${scala.binary.version}</artifactId>
  <version>3.2.10</version>
</dependency>
The version is hard coded.
You can rebuild Spark 1.3.0 with json4s 3.2.11
Cheers
On Mon, Mar 23, 2015 at 2:12
Have you looked at http://happybase.readthedocs.org/en/latest/ ?
Cheers
On Apr 1, 2015, at 4:50 PM, Eric Kimbrel eric.kimb...@soteradefense.com
wrote:
I am attempting to read an hbase table in pyspark with a range scan.
conf = {
  "hbase.zookeeper.quorum": host,
http://docs.oracle.com/cd/B10500_01/java.920/a96654/connpoca.htm
The question doesn't seem to be Spark specific, btw
On Apr 2, 2015, at 4:45 AM, Sateesh Kavuri sateesh.kav...@gmail.com wrote:
Hi,
We have a case where we will have to run concurrent jobs (for the same
algorithm) on
Can you include -X in your maven command and pastebin the output ?
Cheers
On Apr 3, 2015, at 3:58 AM, myelinji myeli...@aliyun.com wrote:
Thank you for your reply. When I use maven to compile the whole project,
the errors are as follows:
[INFO] Spark Project Parent POM
Maybe add another stat for batches waiting in the job queue ?
Cheers
On Fri, Apr 3, 2015 at 10:01 AM, Tathagata Das t...@databricks.com wrote:
Very good question! This is because the current code is written such that
the ui considers a batch as waiting only when it has actually started being
performance of application?
Regards
Jeetendra
On 20 April 2015 at 20:49, Ted Yu yuzhih...@gmail.com wrote:
To my knowledge, Spark SQL currently doesn't provide range scan
capability against hbase.
Cheers
On Apr 20, 2015, at 7:54 AM, Jeetendra Gangele gangele...@gmail.com
wrote:
HI All
The NativeS3FileSystem class is in the hadoop-aws jar.
Looks like it was not on the classpath.
Cheers
On Thu, Apr 23, 2015 at 7:30 AM, Sujee Maniyam su...@sujee.net wrote:
Thanks all...
btw, s3n load works without any issues with spark-1.3.1-built-for-hadoop
2.4
I tried this on 1.3.1-hadoop26
Shuai:
Please take a look at:
http://blog.takipi.com/garbage-collectors-serial-vs-parallel-vs-cms-vs-the-g1-and-whats-new-in-java-8/
On Apr 23, 2015, at 10:18 AM, Dean Wampler deanwamp...@gmail.com wrote:
JVMs often have significant GC overhead with heaps bigger than 64GB. You
might try
Have you tried the following ?
import sqlContext._
import sqlContext.implicits._
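With those in scope, a minimal sketch of the conversion (the case class is illustrative):

// illustrative, relying on the implicits imported above
case class Person(name: String, age: Int)
val df = sc.parallelize(Seq(Person("a", 1), Person("b", 2))).toDF()
df.printSchema()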
Cheers
On Tue, Apr 21, 2015 at 7:54 AM, Wang, Ningjun (LNG-NPV)
ningjun.w...@lexisnexis.com wrote:
I tried to convert an RDD to a data frame using the example code on the
Spark website
case class
Does line 27 correspond to brdcst.value ?
Cheers
On Apr 21, 2015, at 3:19 AM, donhoff_h 165612...@qq.com wrote:
Hi, experts.
I wrote a very small program to learn how to use Broadcast Variables, but
met an exception. The program and the exception are listed below.
Could
This thread from hadoop mailing list should give you some clue:
http://search-hadoop.com/m/LgpTk2df7822
On Wed, Apr 22, 2015 at 9:45 AM, Sujee Maniyam su...@sujee.net wrote:
Hi all
I am unable to access s3n:// urls using sc.textFile().. getting 'no
file system for scheme s3n://' error.
This thread seems related:
http://search-hadoop.com/m/JW1q51W02V
Cheers
On Wed, Apr 22, 2015 at 6:09 AM, James King jakwebin...@gmail.com wrote:
What's the best way to start up a Spark job as part of starting up the
Spark cluster?
I have a single uber jar for my job and want to make the
In master branch, the overhead is now 10% of the executor memory.
For a 5g executor that would be ~500 MB.
FYI
On Apr 22, 2015, at 8:26 AM, nsalian neeleshssal...@gmail.com wrote:
+1 to executor-memory to 5g.
Do check the overhead space for both the driver and the executor as per
Wilfred's suggestion.
Typically, 384 MB should
Yin:
Fix Version of SPARK-4520 is not set.
I assume it was fixed in 1.3.0
Cheers
On Fri, Apr 24, 2015 at 11:00 AM, Yin Huai yh...@databricks.com wrote:
The exception looks like the one mentioned in
https://issues.apache.org/jira/browse/SPARK-4520. What is the version of
Spark?
Please see SPARK-2883
There is no Fix Version yet.
On Fri, Apr 24, 2015 at 5:45 PM, David Mitchell jdavidmitch...@gmail.com
wrote:
Does anyone know in which version of Spark there will be support for
ORC files via spark.sql.hive? Will it be in 1.4?
David
Looks like this is related:
https://issues.apache.org/jira/browse/SPARK-5456
On Sat, Apr 25, 2015 at 6:59 AM, doovs...@sina.com wrote:
Hi all,
When I query Postgresql based on Spark SQL like this:
dataFrame.registerTempTable("Employees")
val emps = sqlContext.sql("select name,
For step 2, you can pipe the application log to a file instead of copy-pasting.
Cheers
On Apr 22, 2015, at 10:48 PM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
I submit a spark app to YARN and i get these messages
15/04/22 22:45:04 INFO yarn.Client: Application report for
To my knowledge, Spark SQL currently doesn't provide range scan capability
against hbase.
Cheers
On Apr 20, 2015, at 7:54 AM, Jeetendra Gangele gangele...@gmail.com wrote:
Hi All,
I am querying HBase, combining the results, and using them in my Spark job.
I am querying HBase using HBase
What JDK release are you using ?
Can you give the complete command you used ?
Which Spark branch are you working with ?
Cheers
On Sun, Apr 19, 2015 at 7:25 PM, Brahma Reddy Battula
brahmareddy.batt...@huawei.com wrote:
Hi All
Getting the following error when I am compiling Spark... What did I
Can you give us more information ?
Such as hbase release, Spark release.
If you can pastebin jstack of the hanging HTable process, that would help.
BTW I used http://search-hadoop.com/?q=spark+HBase+HTable+constructor+hangs
and saw a very old thread with this subject.
Cheers
On Tue, Apr 28,
Credit goes to Misha Chernetsov (see SPARK-4925)
FYI
On Tue, Apr 28, 2015 at 8:25 AM, Marco marco@gmail.com wrote:
Thx Ted for the info !
2015-04-27 23:51 GMT+02:00 Ted Yu yuzhih...@gmail.com:
This is available for 1.3.1:
http://mvnrepository.com/artifact/org.apache.spark/spark-hive
How did you distribute hbase-site.xml to the nodes ?
Looks like HConnectionManager couldn't find the hbase:meta server.
Cheers
On Tue, Apr 28, 2015 at 9:19 PM, Tridib Samanta tridib.sama...@live.com
wrote:
I am using Spark 1.2.0 and HBase 0.98.1-cdh5.1.0.
Here is the jstack trace. Complete
Can you run the command 'ulimit -n' to see the current limit ?
To configure ulimit settings on Ubuntu, edit */etc/security/limits.conf*
Cheers
On Wed, Apr 29, 2015 at 2:07 PM, Bill Jay bill.jaypeter...@gmail.com
wrote:
Hi all,
I am using the direct approach to receive real-time data from
(sql)
} catch {
case e: Exception => logger.error(e.getMessage())
} finally {
if (conn != null) {
conn.close
}
}
}
I am not sure whether the leakage originates from Kafka connector or the
sql connections.
Bill
On Wed, Apr 29, 2015 at 2:12 PM, Ted
bq. a single query on one filter criteria
Can you tell us more about your filter ? How selective is it ?
Which hbase release are you using ?
Cheers
On Thu, Apr 30, 2015 at 7:23 AM, Siddharth Ubale
siddharth.ub...@syncoms.com wrote:
Hi,
I want to use Spark as Query engine on HBase with
)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
On Thu, Apr 30, 2015 at 1:25 PM, Saurabh Gupta saurabh.gu...@semusi.com
wrote:
I am using hbase 0.94.8.
On Wed, Apr 29, 2015 at 11:56 PM, Ted Yu yuzhih...@gmail.com
This is available for 1.3.1:
http://mvnrepository.com/artifact/org.apache.spark/spark-hive-thriftserver_2.10
FYI
On Mon, Feb 16, 2015 at 7:24 AM, Marco marco@gmail.com wrote:
Ok, so will it only be available for the next version (1.3.0)?
2015-02-16 15:24 GMT+01:00 Ted Yu yuzhih
Which hadoop release are you using ?
Can you check the hdfs audit log to see who / when deleted
spark/ck/hdfsaudit/receivedData/0/log-1430139541443-1430139601443 ?
Cheers
On Mon, Apr 27, 2015 at 6:21 AM, Sea 261810...@qq.com wrote:
Hi, all:
I use function updateStateByKey in Spark Streaming, I
Can you try the patch from:
[SPARK-6913][SQL] Fixed java.sql.SQLException: No suitable driver found
Cheers
On Sat, Mar 28, 2015 at 12:41 AM, ÐΞ€ρ@Ҝ (๏̯͡๏) deepuj...@gmail.com wrote:
This is from my Hive installation
-sh-4.1$ ls /apache/hive/lib | grep derby
derby-10.10.1.1.jar