Re: please care and vote for Chinese people under cruel autocracy of CCP, great thanks!
Please do not send spam email. Thanks.

On Thu, 29 Aug 2019, 13:05 ant_fighter wrote:
> Hi all,
> Sorry for disturbing you guys. Though I don't think this is a proper place
> to do this, I need your help, your vote, your holy vote, for us Chinese,
> for conscience and justice, for a better world.
>
> In the over 70 years of ruling over China, the Chinese Communist Party has
> done many horrible things humans can think of. These malicious and evil
> deeds include but are not limited to: falsifying national history,
> suppression of freedom of speech and press, money laundering on the scale
> of trillions, live organ harvesting, sexual harassment and assault of
> underage females, slaughtering innocent citizens on
> counter-revolutionary excuses, etc.
>
> In light of the recent violent actions against Hong Kongers by the People's
> Liberation Army (PLA) disguised as the Hong Kong Police Force, we the people
> petition to officially recognize the Chinese Communist Party as a terrorist
> organization.
> PLEASE SIGN UP and VOTE for us:
>
> https://petitions.whitehouse.gov/petition/call-official-recognition-chinese-communist-party-terrorist-organization
>
> Thanks again to all!
>
> nameless, an ant fighter
> 2019.8.29
Re: [ML] Migrating transformers from mllib to ml
Hi,
I have migrated HashingTF from mllib to ml, and it is waiting for review. See:
[SPARK-21748][ML] Migrate the implementation of HashingTF from MLlib to ML #18998
https://github.com/apache/spark/pull/18998

On Mon, Nov 6, 2017 at 10:58 PM, Marco Gaido wrote:
> Hello,
>
> I saw that there are several TODOs to migrate some transformers (like
> HashingTF and IDF) to use only ml.Vector in order to avoid the overhead of
> converting them to the mllib ones and back.
>
> Is there any reason why this has not been done so far? Is it to avoid code
> duplication? If so, is it still an issue since we are going to deprecate
> mllib from 2.3 (at least this is what I read in the Spark docs)? If not, I
> can work on this.
>
> Thanks,
> Marco
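For anyone following the thread who hasn't looked inside HashingTF: the transformer is just the "hashing trick" — map each term to a bucket by hashing modulo numFeatures, then count hits per bucket. A minimal plain-Scala sketch of that idea (not Spark's actual implementation — ml's HashingTF uses MurmurHash3, and the function name `hashingTF` here is made up for illustration):

```scala
// Sketch of the hashing trick behind HashingTF:
// hash each term into one of numFeatures buckets, then count term frequencies.
def hashingTF(terms: Seq[String], numFeatures: Int): Map[Int, Double] = {
  terms
    .map(t => ((t.hashCode % numFeatures) + numFeatures) % numFeatures) // non-negative bucket index
    .groupBy(identity)
    .map { case (idx, hits) => idx -> hits.size.toDouble }
}

// Tiny "document": "spark" appears twice, so its bucket gets count 2.0.
val tf = hashingTF(Seq("spark", "ml", "spark"), numFeatures = 1 << 10)
println(tf)
```

The result maps naturally onto an ml.linalg sparse vector of length numFeatures, which is what motivates migrating the implementation to operate on ml.Vector directly.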
Re: LibSVM should have just one input file
Hi, yaphet.
It seems that the code you pasted is located in LibSVM, rather than SVM. Do I misunderstand?

For LibSVMDataSource:

1. If numFeatures is unspecified, only one input file is valid:

val df = spark.read.format("libsvm")
  .load("data/mllib/sample_libsvm_data.txt")

2. Otherwise, multiple files are OK:

val df = spark.read.format("libsvm")
  .option("numFeatures", "780")
  .load("data/mllib/sample_libsvm_data.txt")

For more, see:
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.source.libsvm.LibSVMDataSource

On Mon, Jun 12, 2017 at 11:46 AM, darion.yaphet wrote:
> Hi team:
>
> Currently, when we use SVM to train a dataset, we found that the input is
> limited to only one file.
>
> The source code is as follows:
>
> val path = if (dataFiles.length == 1) {
>   dataFiles.head.getPath.toUri.toString
> } else if (dataFiles.isEmpty) {
>   throw new IOException("No input path specified for libsvm data")
> } else {
>   throw new IOException("Multiple input paths are not supported for libsvm data.")
> }
>
> A file stored on a distributed file system such as HDFS is split into
> multiple pieces, so I think this limit is not necessary. I'm not sure
> whether it is a bug, or something I'm not using correctly.
>
> thanks a lot ~~~
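As background on why numFeatures matters here: each line of a libsvm file is `label index:value index:value ...` with 1-based indices, and the vector width is only implied by the largest index seen, so a single-file scan (or an explicit numFeatures) is needed to fix the dimension. A hypothetical one-line parser sketch, not Spark's code:

```scala
// Sketch of parsing one libsvm-format line: "label idx:val idx:val ..."
// Indices in the file are 1-based; convert to 0-based as Spark does.
def parseLibSVMLine(line: String): (Double, Array[Int], Array[Double]) = {
  val items = line.trim.split("\\s+")
  val label = items.head.toDouble
  val (indices, values) = items.tail.map { item =>
    val Array(idx, value) = item.split(':')
    (idx.toInt - 1, value.toDouble)
  }.unzip
  (label, indices, values)
}

val (label, indices, values) = parseLibSVMLine("1.0 3:2.5 10:0.5")
// label = 1.0, indices = Array(2, 9), values = Array(2.5, 0.5)
```

Without numFeatures, nothing on this line says whether the vector has 10 features or 780 — which is why a multi-file load can only work when the width is supplied up front.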
Re: Starter tasks to start contributing
Hi,
I think the starter label is what you want. How about this link:

https://issues.apache.org/jira/browse/SPARK-5?jql=project%20=%20SPARK%20%20AND%20component%20in%20%20("Spark%20Core",%20%20"Structured%20Streaming")%20AND%20status%20=%20Open%20AND%20labels%20=%20starter%20ORDER%20BY%20priority%20DESC

On Wed, May 17, 2017 at 4:29 PM, vys6fudl wrote:
> Hi!
>
> I would like to contribute to Spark since I use it at work. Are there some
> starter tasks related to Spark Core or Spark Streaming that I could work
> on? I couldn't find the right search in JIRA, so if someone could even
> point me to that if there is already stuff tagged there, that would be
> useful as well. The Contributing to Spark page mentions JIRA starter
> tasks, but I couldn't find any.
>
> Thanks!
>
> --
> View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Starter-tasks-to-start-contributing-tp21570.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
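Decoded from the URL-encoding, the JQL query in that link is the following, which can also be pasted directly into JIRA's issue search:

```
project = SPARK AND component in ("Spark Core", "Structured Streaming")
  AND status = Open AND labels = starter
ORDER BY priority DESC
```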
Re: how to retain part of the features in LogisticRegressionModel (spark2.0)
Hi, jinhong.
Do you use `setRegParam`? It is 0.0 by default. Both elasticNetParam and regParam are required if regularization is needed:

val regParamL1 = $(elasticNetParam) * $(regParam)
val regParamL2 = (1.0 - $(elasticNetParam)) * $(regParam)

On Mon, Mar 20, 2017 at 6:31 PM, Yanbo Liang wrote:
> Do you want to get a sparse model in which most of the coefficients are
> zeros? If yes, using L1 regularization leads to sparsity. But the
> LogisticRegressionModel coefficients vector's size is still equal to the
> number of features; you can get the non-zero elements manually. Actually,
> it would be a sparse vector (or matrix for the multinomial case) if it's
> sparse enough.
>
> Thanks
> Yanbo
>
> On Sun, Mar 19, 2017 at 5:02 AM, Dhanesh Padmanabhan <
> dhanesh12...@gmail.com> wrote:
>
>> It shouldn't be difficult to convert the coefficients to a sparse vector.
>> Not sure if that is what you are looking for.
>>
>> -Dhanesh
>>
>> On Sun, Mar 19, 2017 at 5:02 PM jinhong lu wrote:
>>
>> Thanks Dhanesh, and how about the features question?
>>
>> On 19 Mar 2017, at 19:08, Dhanesh Padmanabhan wrote:
>>
>> Dhanesh
>>
>> Thanks,
>> lujinhong
>>
>> --
>> Dhanesh
>> +91-9741125245
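To make the elastic-net split above concrete, here is the same arithmetic in plain Scala (the parameter values are made-up examples): elasticNetParam = 1.0 gives pure L1 (lasso, which drives coefficients to exactly zero), elasticNetParam = 0.0 gives pure L2 (ridge, which only shrinks them), and if regParam stays at its 0.0 default both terms vanish, so no sparsity can appear no matter what elasticNetParam is set to.

```scala
// How Spark splits regParam between the L1 and L2 terms via elasticNetParam.
def elasticNetSplit(regParam: Double, elasticNetParam: Double): (Double, Double) = {
  val regParamL1 = elasticNetParam * regParam
  val regParamL2 = (1.0 - elasticNetParam) * regParam
  (regParamL1, regParamL2)
}

// Pure L1: all of regParam goes to the sparsity-inducing term.
val pureL1 = elasticNetSplit(0.1, 1.0) // regParamL1 = 0.1, regParamL2 = 0.0
// Default regParam = 0.0: no regularization at all, regardless of elasticNetParam.
val none = elasticNetSplit(0.0, 1.0)   // regParamL1 = 0.0, regParamL2 = 0.0
```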