[ 
https://issues.apache.org/jira/browse/SPARK-20203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15953299#comment-15953299
 ] 

Cyril de Vogelaere edited comment on SPARK-20203 at 4/3/17 11:18 AM:
---------------------------------------------------------------------

This change cannot have performance implications in itself, since we are not 
changing anything but the default value.
It does change the number of solutions we are searching for, so of course a run 
will take longer when the search space is bigger.

But on a dataset where it already found everything, it should still do so, and 
it should not be any slower.
Now it would just find everything by default, which, I agree, should be 
debated, to decide whether that is really what we want the default behavior of 
the program to be.
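
To make the effect of the length cap concrete, here is a toy sketch in plain 
Python. It is not Spark's PrefixSpan implementation: it only handles sequences 
of single items, and the names frequent_sequences, min_count and 
max_pattern_length are hypothetical, chosen for illustration. It shows that a 
cap of 10 drops the longer patterns, while an effectively unlimited cap returns 
all of them.

```python
import sys
from collections import Counter


def frequent_sequences(db, min_count, max_pattern_length):
    """Toy PrefixSpan-style miner over sequences of single items.

    Assumption: this is a simplified illustration, not Spark's algorithm.
    Returns a list of (pattern, support_count) pairs.
    """
    results = []

    def grow(prefix, projected):
        # Stop extending once the prefix has reached the length cap.
        if len(prefix) >= max_pattern_length:
            return
        # Count each item at most once per projected suffix.
        counts = Counter()
        for suffix in projected:
            counts.update(set(suffix))
        for item, count in sorted(counts.items()):
            if count < min_count:
                continue
            pattern = prefix + [item]
            results.append((pattern, count))
            # Project: keep what follows the first occurrence of the item.
            next_projected = [
                s[s.index(item) + 1:] for s in projected if item in s
            ]
            grow(pattern, next_projected)

    grow([], db)
    return results


# Two identical sequences of 12 distinct items, so every subsequence
# is frequent at min_count=2.
db = [list("abcdefghijkl"), list("abcdefghijkl")]

capped = frequent_sequences(db, min_count=2, max_pattern_length=10)
full = frequent_sequences(db, min_count=2, max_pattern_length=sys.maxsize)

print(len(capped), max(len(p) for p, _ in capped))
print(len(full), max(len(p) for p, _ in full))
```

On a database whose longest frequent pattern is already within the cap, both 
calls explore exactly the same prefixes, which is the case where raising the 
default should not cost anything.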


was (Author: syrux):
This change cannot have performance implications in itself, since we are not 
changing anything but the default value.
It does change the number of solutions we are searching for, so of course a run 
will take longer when the search space is bigger.

But on a dataset where it already found everything, it should still do so.
Now it would just find everything by default, which, I agree, should be 
debated, to decide whether that is really what we want the default behavior of 
the program to be.

> Change default maxPatternLength value to Int.MaxValue in PrefixSpan
> -------------------------------------------------------------------
>
>                 Key: SPARK-20203
>                 URL: https://issues.apache.org/jira/browse/SPARK-20203
>             Project: Spark
>          Issue Type: Wish
>          Components: MLlib
>    Affects Versions: 2.1.0
>            Reporter: Cyril de Vogelaere
>            Priority: Trivial
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> I think changing the default value to Int.MaxValue would be more user 
> friendly, at least for new users.
> Personally, when I run an algorithm, I expect it to find all solutions by 
> default, and only a limited number of them when I set the parameters to do so.
> The current implementation limits the length of solution patterns to 10, 
> thus preventing all solutions from being printed when running on slightly 
> large datasets.
> I feel that this should be changed, but since it would change the default 
> behavior of PrefixSpan, I think asking for the community's opinion should 
> come first. So, what do you think?


