Hi,
I used to run Spark scripts on my local machine. Now I am porting my code to
EMR and I am facing lots of problems.
The main one right now is that a Spark script which runs properly on my
local machine gives an error when run on an Amazon EMR cluster.
Here is the error:
[image: Inline image 1]
at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:820)
at org.apache.hadoop.io.compress.CompressionCodecFactory.getCodecClasses(CompressionCodecFactory.java:89)
... 64 more
On Sun, Aug 3, 2014 at 6:04 PM, Rahul Bhojwani rahulbhojwani2...@gmail.com
wrote:
Hi,
I would read the files with SparkContext.wholeTextFiles
and call Weka on each one.
Matei
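For what it's worth, a minimal PySpark sketch of that approach: read each file
as a single record with SparkContext.wholeTextFiles and process it
independently. The input path, the process_file helper, and the comma-separated
row format are assumptions, and the clustering step is only a placeholder (the
thread mentions calling Weka there).
###
from pyspark import SparkContext

sc = SparkContext(appName="per-file-clustering")

def process_file(path, contents):
    # Placeholder per-file step: parse the rows of features and cluster them
    # (the thread suggests calling Weka here instead).
    rows = [line.split(",") for line in contents.splitlines() if line]
    # ... run clustering on `rows` and pair each row with its cluster id ...
    return path, rows

# wholeTextFiles yields (path, contents) pairs, one per file, so each of the
# ten input files is handled as a single record.
results = (sc.wholeTextFiles("input_dir/*")
             .map(lambda kv: process_file(kv[0], kv[1]))
             .collect())

# Write one output file per input file on the driver.
for path, rows in results:
    out_name = path.split("/")[-1] + ".clustered"
    with open(out_name, "w") as f:
        for row in rows:
            f.write(",".join(row) + "\n")

sc.stop()
###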
On Jul 14, 2014, at 11:30 AM, Rahul Bhojwani rahulbhojwani2...@gmail.com
wrote:
I understand that the question is very unprofessional, but I am a newbie.
If this is not the right place, could you share some link where I can ask
such questions? But please answer.
I am getting an error saying:
Exception in thread "delete Spark temp dir
C:\Users\shawn\AppData\Local\Temp\spark-b4f1105c-d67b-488c-83f9-eff1d1b95786"
java.io.IOException: Failed to delete:
C:\Users\shawn\AppData\Local\Temp\spark-b4f1105c-d67b-488c-83f9-eff1d1b95786\tmppr36zu
Hey, my question is for this situation:
Suppose we have 10 files, each containing a list of features in each row.
The task is, for each file, to cluster the features in that file and write the
corresponding cluster along with them in a new file. So we have to generate
10 more files by applying
I understand that the question is very unprofessional, but I am a newbie.
If this is not the right place, could you share some link where I can ask
such questions? But please answer.
On Mon, Jul 14, 2014 at 6:52 PM, Rahul Bhojwani rahulbhojwani2...@gmail.com
wrote:
Hey, My question is for this situation
there is a smoothing parameter, and yes, from the looks of it, it is
simply additive / Laplace smoothing. It's been in there for a while.
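For reference, the smoothing parameter is exposed in the MLlib Python API as
the lambda_ argument of NaiveBayes.train. A minimal sketch, assuming Spark
1.0's Python API (the toy data is made up):
###
from numpy import array
from pyspark import SparkContext
from pyspark.mllib.classification import NaiveBayes
from pyspark.mllib.regression import LabeledPoint

sc = SparkContext(appName="nb-smoothing")

# Tiny made-up dataset: a label plus a count-style feature vector.
data = sc.parallelize([
    LabeledPoint(0.0, array([1.0, 0.0, 2.0])),
    LabeledPoint(1.0, array([0.0, 3.0, 1.0])),
])

# lambda_ is the additive (Laplace) smoothing parameter; 1.0 is the default.
model = NaiveBayes.train(data, lambda_=1.0)
print(model.predict(array([0.0, 1.0, 1.0])))

sc.stop()
###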
On Thu, Jul 10, 2014 at 6:55 AM, Rahul Bhojwani
rahulbhojwani2...@gmail.com wrote:
The discussion is in the context of Spark 0.9.1.
Does MLlib Naive Bayes
(Bug according to me.)
I'm not trying to be selfish. It's just that if I get something that can
help make my profile look strong, then I shouldn't miss it at this stage.
Thanks,
On Thu, Jul 10, 2014 at 5:54 PM, Rahul Bhojwani rahulbhojwani2...@gmail.com
wrote:
Ya thanks. I can see that lambda
/jira/browse/SPARK/
Bertrand
On Thu, Jul 10, 2014 at 2:37 PM, Rahul Bhojwani
rahulbhojwani2...@gmail.com wrote:
And also that there is a small bug in the implementation, as I mentioned
earlier. This is the first time I am reporting a bug, so I just wanted to ask
whether your name
I am using Naive Bayes in MLlib.
Below I have printed the log of *model.theta* after training on the train data.
You can check that it contains 9 features for 2-class classification.
print numpy.log(model.theta)
[[ 0.31618962 0.16636852 0.07200358 0.05411449 0.08542039 0.17620751
0.03711986
According to me there is a BUG in the MLlib Naive Bayes implementation in
Spark 0.9.1.
Whom should I report this to, or with whom should I discuss it? I can discuss
this over a call as well.
My Skype ID : rahul.bhijwani
Phone no: +91-9945197359
Thanks,
--
Rahul K Bhojwani
3rd Year B.Tech
Computer
The discussion is in the context of Spark 0.9.1.
Does the MLlib Naive Bayes implementation incorporate Laplace smoothing, or
any other smoothing? Or does it not incorporate any smoothing? Please
inform.
Thanks,
--
Rahul K Bhojwani
3rd Year B.Tech
Computer Science and Engineering
National Institute of
Hello,
I am a novice. I want to classify text into two classes. For this purpose
I want to use a Naive Bayes model. I am using Python for it.
Here are the problems I am facing:
*Problem 1:* I wanted to use all words as features for the bag-of-words
model, which means my features will be count
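A rough sketch of the bag-of-words counting that Problem 1 describes, assuming
a fixed vocabulary (the vocabulary and the example document below are made up;
this is plain Python run before handing the vectors to MLlib):
###
from collections import Counter

# Hypothetical vocabulary collected from the training corpus.
vocabulary = ["good", "bad", "movie", "boring", "great"]

def count_features(text):
    # Bag of words: one count per vocabulary word, in a fixed order.
    counts = Counter(text.lower().split())
    return [float(counts[w]) for w in vocabulary]

print(count_features("good movie really good"))  # [2.0, 0.0, 1.0, 0.0, 0.0]
###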
Hi,
I wanted to use Naive Bayes for a text classification problem. I am using
Spark 0.9.1.
I was just curious to ask: is the Naive Bayes implementation in Spark
0.9.1 correct? Or are there any bugs in the Spark 0.9.1 implementation
that are taken care of in Spark 1.0? My question is specific
Hi,
I am using the MLlib Naive Bayes for a text classification problem. I have
a very small amount of training data. Then the data will be coming in
continuously and I need to classify it as either A or B. I am training the
MLlib Naive Bayes model using the training data, but next time when data
Hi,
I am getting this error. Can anyone help explain why this error is
coming?
Exception in thread "delete Spark temp dir
C:\Users\shawn\AppData\Local\Temp\spark-27f60467-36d4-4081-aaf5-d0ad42dda560"
java.io.IOException: Failed to delete:
of your executor being killed. For
example, Yarn will do that if you're going over the requested memory
limits.
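If memory is the issue, one knob worth checking is the executor memory, which
can be set when the SparkContext is created. A minimal sketch; the value is
only a placeholder and the right setting depends on the cluster:
###
from pyspark import SparkConf, SparkContext

# Placeholder value; tune it to the cluster and the job.
conf = (SparkConf()
        .setAppName("classification-job")
        .set("spark.executor.memory", "2g"))

sc = SparkContext(conf=conf)
###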
On Tue, Jul 8, 2014 at 12:17 PM, Rahul Bhojwani
rahulbhojwani2...@gmail.com wrote:
Hi,
I am getting this error. Can anyone help explain why this error is
coming
(train_data)))
file_predicted.write(msg + "##" + sentiment + "\n")
file_predicted.close()
###
If you can have a look at the code and help me out, it would be great.
Thanks
On Wed, Jul 9, 2014 at 12:54 AM, Rahul Bhojwani rahulbhojwani2...@gmail.com
wrote:
Hi Marcelo.
Thanks
These are the logs. Can you suggest something after looking at them?
On Wed, Jul 9, 2014 at 1:10 AM, Rahul Bhojwani rahulbhojwani2...@gmail.com
wrote:
Here I am adding my code, if you can have a look to help me out.
Thanks
###
import
to update the priors and conditional
probabilities, which means we should also remember the number of
observations for the updates.
Best,
Xiangrui
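A rough sketch of what remembering the counts could look like outside of
MLlib: keep running class and feature counts, and recompute the smoothed
priors and conditional probabilities from them whenever new data arrives. The
IncrementalNaiveBayes class below is purely illustrative, not an MLlib API.
###
import numpy as np

class IncrementalNaiveBayes(object):
    """Keeps raw counts so priors and conditionals can be updated over time."""

    def __init__(self, num_classes, num_features, smoothing=1.0):
        self.smoothing = smoothing
        self.class_counts = np.zeros(num_classes)            # observations per class
        self.feature_counts = np.zeros((num_classes, num_features))

    def update(self, label, features):
        # Accumulate counts instead of storing only probabilities,
        # so later batches can be folded in.
        self.class_counts[label] += 1
        self.feature_counts[label] += features

    def log_prior(self):
        total = self.class_counts.sum()
        return np.log((self.class_counts + self.smoothing) /
                      (total + self.smoothing * len(self.class_counts)))

    def log_conditional(self):
        # Additive smoothing over the feature totals of each class.
        totals = self.feature_counts.sum(axis=1, keepdims=True)
        num_features = self.feature_counts.shape[1]
        return np.log((self.feature_counts + self.smoothing) /
                      (totals + self.smoothing * num_features))
###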
On Tue, Jul 8, 2014 at 7:35 AM, Rahul Bhojwani
rahulbhojwani2...@gmail.com wrote:
Hi,
I am using the MLlib Naive Bayes for a text classification
the case for text
classification. I would recommend upgrading to v1.0. -Xiangrui
On Tue, Jul 8, 2014 at 7:20 AM, Rahul Bhojwani
rahulbhojwani2...@gmail.com wrote:
Hi,
I wanted to use Naive Bayes for a text classification problem. I am using
Spark 0.9.1.
I was just curious to ask
need summation.
Best,
Xiangrui
On Tue, Jul 8, 2014 at 12:01 AM, Rahul Bhojwani
rahulbhojwani2...@gmail.com wrote:
I am really sorry. It's actually my mistake. My problem 2 is wrong, because
using a single feature is a senseless thing. Sorry for the inconvenience.
But still I will be waiting
program, so Spark
can clean up after itself?
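Assuming the question is about stopping the SparkContext at the end of the
script, a minimal sketch; stopping the context gives Spark a chance to remove
its temporary directories before the process exits (the job body is just a
placeholder):
###
from pyspark import SparkContext

sc = SparkContext(appName="cleanup-example")

try:
    # ... the actual job goes here (placeholder) ...
    print(sc.parallelize(range(100)).sum())
finally:
    # Stop the context so Spark can clean up after itself.
    sc.stop()
###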
On Tue, Jul 8, 2014 at 12:40 PM, Rahul Bhojwani
rahulbhojwani2...@gmail.com wrote:
Here I am adding my code, if you can have a look to help me out.
Thanks
###
import tokenizer
import gettingWordLists as gl
from
Hi,
I recently posted a question on Stack Overflow but didn't get any reply. I
have joined the mailing list now. Can any one of you guide me towards a way
to solve the problem mentioned in
http://stackoverflow.com/questions/23923966/writing-the-rdd-data-in-excel-file-along-mapping-in-apache-spark
Thanks in advance.
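One possible approach for the linked question, sketched below: collect the
mapped RDD back to the driver and write it out with xlsxwriter. The input
path, the parsing, and the output layout are made up, and this assumes the
collected data fits in driver memory.
###
import xlsxwriter
from pyspark import SparkContext

sc = SparkContext(appName="rdd-to-excel")

# Hypothetical input: one comma-separated record per line.
rows = (sc.textFile("input.txt")
          .map(lambda line: line.split(","))
          .collect())               # bring the mapped data back to the driver

workbook = xlsxwriter.Workbook("output_excel.xlsx")
worksheet = workbook.add_worksheet()

# Write each record into its own spreadsheet row.
for r, row in enumerate(rows):
    for c, value in enumerate(row):
        worksheet.write(r, c, value)

workbook.close()
sc.stop()
###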
Thanks Marcelo,
It actually made a few of my concepts clear. (y)
On Fri, May 30, 2014 at 10:14 PM, Marcelo Vanzin van...@cloudera.com
wrote:
Hello there,
On Fri, May 30, 2014 at 9:36 AM, Marcelo Vanzin van...@cloudera.com
wrote:
workbook = xlsxwriter.Workbook('output_excel.xlsx')
worksheet = workbook.add_worksheet()
Thanks Jey,
it was helpful.
On Sat, May 31, 2014 at 12:45 AM, Rahul Bhojwani
rahulbhojwani2...@gmail.com wrote:
Thanks Marcelo,
It actually made a few of my concepts clear. (y)
On Fri, May 30, 2014 at 10:14 PM, Marcelo Vanzin van...@cloudera.com
wrote:
Hello there,
On Fri, May 30, 2014