[
https://issues.apache.org/jira/browse/SPARK-23105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16335526#comment-16335526
]
Nick Pentreath commented on SPARK-23105:
Certain of the ML QA sub-tasks are marked {{Blocker
[
https://issues.apache.org/jira/browse/SPARK-13964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16334599#comment-16334599
]
Nick Pentreath commented on SPARK-13964:
Yes, that's certainly something I'd like to see added
At least one of their comparisons is flawed.
The Spark ML version of linear regression (*note* they use linear
regression and not logistic regression; it is not clear why) uses L-BFGS as
the solver, not SGD (as MLlib uses). Hence it is typically going to be
slower. However, it should in most
[
https://issues.apache.org/jira/browse/SPARK-23154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332252#comment-16332252
]
Nick Pentreath commented on SPARK-23154:
SGTM
> Document backwards compatibility guarant
SVMWithSGD sits in the older "mllib" package and is not directly compatible
with the DataFrame API. I suppose one could write an ML-API wrapper around
it.
However, there is LinearSVC in Spark 2.2.x:
http://spark.apache.org/docs/latest/ml-classification-regression.html#linear-support-vector-machine
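As a rough illustration of the DataFrame-based alternative mentioned above, here is a minimal PySpark sketch. It assumes Spark 2.2 or later and a training DataFrame with a vector "features" column and a 0/1 "label" column. The parameter values and the helper names (svc_params, fit_linear_svc) are hypothetical and not taken from the original message.

```python
# Hypothetical settings for illustration only; tune them for real data.
DEFAULT_SVC_PARAMS = {
    "featuresCol": "features",
    "labelCol": "label",
    "maxIter": 10,
    "regParam": 0.1,
}


def svc_params(**overrides):
    """Return the illustrative defaults with any caller overrides applied."""
    return {**DEFAULT_SVC_PARAMS, **overrides}


def fit_linear_svc(train_df, **overrides):
    """Fit a LinearSVC on a DataFrame (assumed columns: 'features', 'label')."""
    # Imported lazily so this module loads even without a Spark installation.
    from pyspark.ml.classification import LinearSVC

    return LinearSVC(**svc_params(**overrides)).fit(train_df)
```

Calling fit_linear_svc(train_df, regParam=0.01) would return a fitted model whose transform method adds a prediction column to a DataFrame.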
[
https://issues.apache.org/jira/browse/SPARK-23048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-23048:
--
Assignee: Liang-Chi Hsieh
> Update mllib docs to replace OneHotEnco
[
https://issues.apache.org/jira/browse/SPARK-23048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-23048.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 20257
[https
[
https://issues.apache.org/jira/browse/SPARK-23127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-23127.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 20293
[https
[
https://issues.apache.org/jira/browse/SPARK-23127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-23127:
--
Assignee: Nick Pentreath
> Update FeatureHasher user guide for catCols parame
[
https://issues.apache.org/jira/browse/SPARK-23127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath updated SPARK-23127:
---
Description: SPARK-22801 added the {{categoricalCols}} parameter and
updated the Scala
Nick Pentreath created SPARK-23127:
--
Summary: Update FeatureHasher user guide for catCols parameter
Key: SPARK-23127
URL: https://issues.apache.org/jira/browse/SPARK-23127
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-23060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326866#comment-16326866
]
Nick Pentreath commented on SPARK-23060:
I agree I don't see enough of a compelling case
[
https://issues.apache.org/jira/browse/SPARK-21108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-21108.
Resolution: Fixed
> convert LinearSVC to aggregator framew
[
https://issues.apache.org/jira/browse/SPARK-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-21856:
--
Assignee: Chunsheng Ji
> Update Python API for MultilayerPerceptronClassifierMo
[
https://issues.apache.org/jira/browse/SPARK-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-21856:
--
Assignee: (was: Weichen Xu)
> Update Python
[
https://issues.apache.org/jira/browse/SPARK-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-21856:
--
Assignee: Weichen Xu
> Update Python API for MultilayerPerceptronClassifierMo
[
https://issues.apache.org/jira/browse/SPARK-21856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-21856.
Resolution: Fixed
> Update Python API for MultilayerPerceptronClassifierMo
[
https://issues.apache.org/jira/browse/SPARK-22943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16326151#comment-16326151
]
Nick Pentreath commented on SPARK-22943:
Does the new estimator & model version of OHE s
[
https://issues.apache.org/jira/browse/SPARK-22993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-22993.
Resolution: Fixed
> checkpointInterval param doc should be clea
[
https://issues.apache.org/jira/browse/SPARK-22993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-22993:
--
Assignee: Seth Hendrickson
> checkpointInterval param doc should be clea
[
https://issues.apache.org/jira/browse/SPARK-22871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307210#comment-16307210
]
Nick Pentreath commented on SPARK-22871:
Tree-based feature transformation is covered in SPARK
[
https://issues.apache.org/jira/browse/SPARK-22801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-22801.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 19991
[https
[
https://issues.apache.org/jira/browse/SPARK-22397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-22397:
--
Assignee: Huaxin Gao
> Add multiple column support to QuantileDiscreti
[
https://issues.apache.org/jira/browse/SPARK-22397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-22397.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 19715
[https
[
https://issues.apache.org/jira/browse/SPARK-22799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath updated SPARK-22799:
---
Description: See the related discussion:
https://issues.apache.org/jira/browse/SPARK-8418
Nick Pentreath created SPARK-22801:
--
Summary: Allow FeatureHasher to specify numeric columns to treat
as categorical
Key: SPARK-22801
URL: https://issues.apache.org/jira/browse/SPARK-22801
Project
[
https://issues.apache.org/jira/browse/SPARK-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292320#comment-16292320
]
Nick Pentreath edited comment on SPARK-8418 at 12/15/17 10:40 AM
[
https://issues.apache.org/jira/browse/SPARK-22799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath updated SPARK-22799:
---
Issue Type: Improvement (was: New Feature)
> Bucketizer should throw exception if sin
[
https://issues.apache.org/jira/browse/SPARK-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292320#comment-16292320
]
Nick Pentreath commented on SPARK-8418:
---
Created SPARK-22796, SPARK-22797 and SPARK-22798 to track
Nick Pentreath created SPARK-22799:
--
Summary: Bucketizer should throw exception if single- and
multi-column params are both set
Key: SPARK-22799
URL: https://issues.apache.org/jira/browse/SPARK-22799
Nick Pentreath created SPARK-22798:
--
Summary: Add multiple column support to PySpark StringIndexer
Key: SPARK-22798
URL: https://issues.apache.org/jira/browse/SPARK-22798
Project: Spark
Nick Pentreath created SPARK-22797:
--
Summary: Add multiple column support to PySpark Bucketizer
Key: SPARK-22797
URL: https://issues.apache.org/jira/browse/SPARK-22797
Project: Spark
Issue
Nick Pentreath created SPARK-22796:
--
Summary: Add multiple column support to PySpark QuantileDiscretizer
Key: SPARK-22796
URL: https://issues.apache.org/jira/browse/SPARK-22796
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-19357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16288993#comment-16288993
]
Nick Pentreath commented on SPARK-19357:
I've thought about this and taken a look at the proposed
[
https://issues.apache.org/jira/browse/SPARK-22700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-22700.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 19894
[https
[
https://issues.apache.org/jira/browse/SPARK-22700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-22700:
--
Assignee: zhengruifeng
> Bucketizer.transform incorrectly drops row containing
[
https://issues.apache.org/jira/browse/SPARK-22690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-22690.
Resolution: Fixed
> Imputer inherit HasOutputC
[
https://issues.apache.org/jira/browse/SPARK-22690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath updated SPARK-22690:
---
Fix Version/s: 2.3.0
> Imputer inherit HasOutputC
[
https://issues.apache.org/jira/browse/SPARK-22690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-22690:
--
Assignee: zhengruifeng
> Imputer inherit HasOutputC
[
https://issues.apache.org/jira/browse/SPARK-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16275426#comment-16275426
]
Nick Pentreath commented on SPARK-8418:
---
*1 I’m ok with throwing an exception. We can update
Hi Tomasz
Parallel evaluation for CrossValidation and TrainValidationSplit was added
for Spark 2.3 in https://issues.apache.org/jira/browse/SPARK-19357
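For illustration, a minimal PySpark sketch of how the parallel evaluation might be switched on, assuming Spark 2.3 or later. The estimator, the grid values, and the helper names (GRID, num_fits, build_cross_validator) are hypothetical choices made for this example.

```python
from math import prod

# Hypothetical grid for illustration: parameter name -> candidate values.
GRID = {"regParam": [0.01, 0.1, 1.0], "elasticNetParam": [0.0, 0.5]}


def num_fits(grid, num_folds):
    """Fits done during the search: one per (fold, parameter combination).

    The final refit of the best model on the full data is not counted.
    """
    return num_folds * prod(len(values) for values in grid.values())


def build_cross_validator(num_folds=3, parallelism=4):
    """Build a CrossValidator that evaluates several models concurrently."""
    # Imported lazily so this module loads even without a Spark installation.
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.evaluation import BinaryClassificationEvaluator
    from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

    lr = LogisticRegression(maxIter=10)
    builder = ParamGridBuilder()
    for name, values in GRID.items():
        builder = builder.addGrid(lr.getParam(name), values)

    return CrossValidator(
        estimator=lr,
        estimatorParamMaps=builder.build(),
        evaluator=BinaryClassificationEvaluator(),
        numFolds=num_folds,
        parallelism=parallelism,  # number of models fitted and evaluated at once
    )
```

With parallelism left at 1 the fits run one after another. A higher value only helps if the cluster has spare capacity to run the jobs side by side.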
On Wed, 29 Nov 2017 at 16:31 Tomasz Dudek
wrote:
> Hey,
>
> is there a way to make the following code:
>
> val
For that package specifically it’s best to see if they have a mailing list
and if not perhaps ask on github issues.
Having said that perhaps the folks involved in that package will reply here
too.
On Wed, 22 Nov 2017 at 20:03, Andy Davidson
wrote:
> I am starting
[
https://issues.apache.org/jira/browse/SPARK-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-20199:
--
Assignee: pralabhkumar
> GradientBoostedTreesModel doesn't h
[
https://issues.apache.org/jira/browse/SPARK-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-20199.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 18118
[https
+1 I think that’s practical
On Fri, 10 Nov 2017 at 03:13, Erik Erlandson wrote:
> +1 on extending the deadline. It will significantly improve the logistics
> for upstreaming the Kubernetes back-end. Also agreed, on the general
> realities of reduced bandwidth over the
[
https://issues.apache.org/jira/browse/SPARK-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-20542.
Resolution: Fixed
Fix Version/s: 2.3.0
> Add an API into Bucketizer that can
[
https://issues.apache.org/jira/browse/SPARK-13030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16226448#comment-16226448
]
Nick Pentreath commented on SPARK-13030:
I just think it makes sense for OHE to be an Estimator
For now, you must follow this approach of constructing a pipeline
consisting of a StringIndexer for each categorical column. See
https://issues.apache.org/jira/browse/SPARK-11215 for the related JIRA to
allow multiple columns for StringIndexer, which is being worked on
currently.
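A minimal PySpark sketch of the approach described above, with one StringIndexer stage per categorical column. The "_idx" output suffix and the helper names (indexer_columns, build_indexing_pipeline) are assumptions made for illustration.

```python
def indexer_columns(categorical_cols, suffix="_idx"):
    """Pair each categorical input column with a derived output column name."""
    return [(col, col + suffix) for col in categorical_cols]


def build_indexing_pipeline(categorical_cols):
    """Build a Pipeline holding one StringIndexer per categorical column."""
    # Imported lazily so this module loads even without a Spark installation.
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import StringIndexer

    stages = [
        StringIndexer(inputCol=in_col, outputCol=out_col)
        for in_col, out_col in indexer_columns(categorical_cols)
    ]
    return Pipeline(stages=stages)
```

Fitting the pipeline with build_indexing_pipeline(["country", "device"]).fit(df) would yield a model that adds "country_idx" and "device_idx" columns on transform. The column names here are made up.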
The reason
[
https://issues.apache.org/jira/browse/SPARK-22397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath updated SPARK-22397:
---
Description: Once SPARK-20542 adds multi column support to {{Bucketizer}},
we can add multi
[
https://issues.apache.org/jira/browse/SPARK-22397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16224459#comment-16224459
]
Nick Pentreath commented on SPARK-22397:
[~huaxing] is working on this and will submit a PR
Nick Pentreath created SPARK-22397:
--
Summary: Add multiple column support to QuantileDiscretizer
Key: SPARK-22397
URL: https://issues.apache.org/jira/browse/SPARK-22397
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-8418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16224454#comment-16224454
]
Nick Pentreath commented on SPARK-8418:
---
Adding SPARK-13030, since the new version
[
https://issues.apache.org/jira/browse/SPARK-22346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16219006#comment-16219006
]
Nick Pentreath commented on SPARK-22346:
SPARK-19141 mentions another option which may work
[
https://issues.apache.org/jira/browse/SPARK-22331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214747#comment-16214747
]
Nick Pentreath commented on SPARK-22331:
I can't think of any examples offhand where case
[
https://issues.apache.org/jira/browse/SPARK-22289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207133#comment-16207133
]
Nick Pentreath commented on SPARK-22289:
I think option (2) is the more general fix here
[
https://issues.apache.org/jira/browse/SPARK-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-20542:
--
Assignee: Liang-Chi Hsieh
> Add an API into Bucketizer that can bin a lot of colu
[
https://issues.apache.org/jira/browse/SPARK-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196673#comment-16196673
]
Nick Pentreath commented on SPARK-10802:
SPARK-20679 has been completed for the new ML API. I've
[
https://issues.apache.org/jira/browse/SPARK-10802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-10802.
Resolution: Won't Fix
> Let ALS recommend for subset of d
[
https://issues.apache.org/jira/browse/SPARK-20679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-20679:
--
Assignee: Nick Pentreath
> Let ML ALS recommend for a subset of users/it
[
https://issues.apache.org/jira/browse/SPARK-20679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-20679.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 18748
[https
[
https://issues.apache.org/jira/browse/SPARK-22115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16196288#comment-16196288
]
Nick Pentreath commented on SPARK-22115:
Best keep it private for now.
There's been lot
eases/spark-release-2-2-0.html#known-issues
> before due to this reason.
> I believe it should be fine and probably we should note if possible. I
> believe this should not be a regression anyway as, if I understood
> correctly, it was there from the very first place.
>
> Thank
Checked sigs & hashes.
Tested on RHEL
build/mvn -Phadoop-2.7 -Phive -Pyarn test passed
Python tests passed
I ran R tests and am getting some failures:
https://gist.github.com/MLnick/ddf4d531d5125208771beee0cc9c697e (I seem to
recall similar issues on a previous release but I thought it was
Ah right! Was using a new cloud instance and didn't realize I was logged in
as root! thanks
On Tue, 3 Oct 2017 at 21:13 Marcelo Vanzin <van...@cloudera.com> wrote:
> Maybe you're running as root (or the admin account on your OS)?
>
> On Tue, Oct 3, 2017 at 12:12 PM, Nick Pentreath
Hmm I'm consistently getting this error in core tests:
- SPARK-3697: ignore directories that cannot be read. *** FAILED ***
2 was not equal to 1 (FsHistoryProviderSuite.scala:146)
Anyone else? Any insight? Perhaps it's my setup.
>>
>> On Tue, Oct 3, 2017 at 7:24 AM Holden Karau
[
https://issues.apache.org/jira/browse/SPARK-22115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16187740#comment-16187740
]
Nick Pentreath commented on SPARK-22115:
Do we plan to make this private? Or are you suggesting
I'd agree with #1 or #2. Deprecation now seems fine.
Perhaps this should be raised on the user list also?
And perhaps it makes sense to look at moving the Flume support into Apache
Bahir if there is interest (I've cc'ed Bahir dev list here)? That way the
current state of the connector could keep
Congratulations!
>>
>> Matei Zaharia wrote
>> > Hi all,
>> >
>> > The Spark PMC recently added Tejas Patil as a committer on the
>> > project. Tejas has been contributing across several areas of Spark for
>> > a while, focusing especially on scalability issues and SQL. Please
>> > join me in
MLlib currently doesn't support CBOW - there is an open PR for it (see
https://issues.apache.org/jira/browse/SPARK-20372).
On Thu, 28 Sep 2017 at 09:56 pun wrote:
> Hello,
> My understanding is that word2vec can be run in two modes:
>
> - continuous bag-of-words
[
https://issues.apache.org/jira/browse/SPARK-13030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16180073#comment-16180073
]
Nick Pentreath commented on SPARK-13030:
Yes definitely needs to support multi column. [~viirya
[
https://issues.apache.org/jira/browse/SPARK-13030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16177939#comment-16177939
]
Nick Pentreath commented on SPARK-13030:
It's ugly but we can introduce a new class
[
https://issues.apache.org/jira/browse/SPARK-22061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-22061.
Resolution: Won't Fix
> Add pipeline model of
[
https://issues.apache.org/jira/browse/SPARK-22061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16175124#comment-16175124
]
Nick Pentreath commented on SPARK-22061:
Agreed, this already exists. I closed this issue.
>
[
https://issues.apache.org/jira/browse/SPARK-21958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-21958:
--
Assignee: Travis Hegner
> Attempting to save large Word2Vec model hangs dri
[
https://issues.apache.org/jira/browse/SPARK-21958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-21958.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 19191
[https
[
https://issues.apache.org/jira/browse/SPARK-22021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167806#comment-16167806
]
Nick Pentreath commented on SPARK-22021:
Why a JavaScript function? I think this is not a good
[
https://issues.apache.org/jira/browse/SPARK-21958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16160862#comment-16160862
]
Nick Pentreath commented on SPARK-21958:
Seems like your proposal could improve things - but yeah
[
https://issues.apache.org/jira/browse/SPARK-19357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-19357:
--
Assignee: Bryan Cutler
> Parallel Model Evaluation for ML Tuning: Sc
[
https://issues.apache.org/jira/browse/SPARK-19357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-19357.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 16774
[https
[
https://issues.apache.org/jira/browse/SPARK-21926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154901#comment-16154901
]
Nick Pentreath edited comment on SPARK-21926 at 9/6/17 6:54 AM:
For #2
[
https://issues.apache.org/jira/browse/SPARK-21926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16154901#comment-16154901
]
Nick Pentreath commented on SPARK-21926:
For #2, (a) is definitely the correct solution.
> S
[
https://issues.apache.org/jira/browse/SPARK-15790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-15790.
Resolution: Fixed
> Audit @Since annotations in
> On Fri, Sep 1, 2017 at 11:46 AM, Nick Pentreath <nick.pentre...@gmail.com>
> wrote:
>
>> Dataset does have storageLevel. So you can use isCached = (storageLevel
>> != StorageLevel.NONE) as a test.
>>
>> Arguably isCached could be added to dataset too, sh
Dataset does have storageLevel. So you can use isCached = (storageLevel !=
StorageLevel.NONE) as a test.
Arguably isCached could be added to dataset too, shouldn't be a
controversial change.
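A minimal PySpark sketch of that test. Instead of comparing against StorageLevel.NONE directly, it inspects the level's flags, which is assumed to be equivalent because the NONE level stores data nowhere. The helper names (is_cached_level, dataset_is_cached) are hypothetical.

```python
def is_cached_level(storage_level):
    """True if the level keeps data on disk, in memory, or off-heap.

    Assumption: a level with all three flags off corresponds to
    StorageLevel.NONE, i.e. the dataset is not persisted.
    """
    return bool(
        storage_level.useDisk
        or storage_level.useMemory
        or storage_level.useOffHeap
    )


def dataset_is_cached(df):
    """Apply the check to a DataFrame via its storageLevel property."""
    return is_cached_level(df.storageLevel)
```

After df.cache(), dataset_is_cached(df) should return True, and it should return False again once df.unpersist() has been called.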
On Fri, 1 Sep 2017 at 17:31, Nathan Kronenfeld
wrote:
> I'm currently
MLlib has tried quite hard to ensure the migration guide is up to date for
each release. I think generally we catch all breaking and most major
behavior changes
On Wed, 30 Aug 2017 at 17:02, Dongjoon Hyun wrote:
> +1
>
> On Wed, Aug 30, 2017 at 7:54 AM, Xiao Li
[
https://issues.apache.org/jira/browse/SPARK-21469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-21469.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 19024
[https
[
https://issues.apache.org/jira/browse/SPARK-21469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-21469:
--
Assignee: Bryan Cutler
> Add doc and example for FeatureHas
[
https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138007#comment-16138007
]
Nick Pentreath commented on SPARK-21086:
Ok - I commented on the PR.
Agree it makes sense
[
https://issues.apache.org/jira/browse/SPARK-21799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136630#comment-16136630
]
Nick Pentreath commented on SPARK-21799:
Refer to SPARK-18608 and SPARK-19422. There is some work
[
https://issues.apache.org/jira/browse/SPARK-21468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-21468:
--
Assignee: Nick Pentreath
> FeatureHasher Python
[
https://issues.apache.org/jira/browse/SPARK-21468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-21468.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 18970
[https
[
https://issues.apache.org/jira/browse/SPARK-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16132118#comment-16132118
]
Nick Pentreath commented on SPARK-4981:
---
Hey folks, as interesting as this would be, I think it's
[
https://issues.apache.org/jira/browse/SPARK-21742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128512#comment-16128512
]
Nick Pentreath commented on SPARK-21742:
Isn't the solution to set a fixed seed for the randomly
[
https://issues.apache.org/jira/browse/SPARK-13969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath resolved SPARK-13969.
Resolution: Fixed
Fix Version/s: 2.3.0
Issue resolved by pull request 18513
[https
[
https://issues.apache.org/jira/browse/SPARK-13969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nick Pentreath reassigned SPARK-13969:
--
Assignee: Nick Pentreath
> Extend input format that feature hashing can han
[
https://issues.apache.org/jira/browse/SPARK-21723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126939#comment-16126939
]
Nick Pentreath commented on SPARK-21723:
Yes, we should definitely be able to write LibSVM format
[
https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112623#comment-16112623
]
Nick Pentreath commented on SPARK-21086:
I just want to understand _why_ folks want to keep all
[
https://issues.apache.org/jira/browse/SPARK-21624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16112489#comment-16112489
]
Nick Pentreath commented on SPARK-21624:
I wonder if it makes sense to make it a {{Vector