GitHub user ygcao commented on the pull request:

    https://github.com/apache/spark/pull/10152#issuecomment-163517944
  
    Seeing is believing, and comparison is the key. I encourage you to build
    models with both my version (using a simple sentence splitter that splits
    on periods and question marks) and the old version from the same text or
    data set, then compare them to see the differences.

    By the way, if your data is not text: any sequence data has natural
    boundaries, just as text has sentences. For example, the natural boundary
    of a user session is the time span of continuous operations.
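
    The two kinds of boundary mentioned above can be sketched roughly as
    follows. This is only an illustration in plain Python, not the code from
    the pull request: the function names `split_sentences` and
    `split_sessions`, and the 30-minute gap used as a session boundary, are
    assumptions made for the example.

```python
import re


def split_sentences(text):
    """Split text on '.' and '?' and tokenize each sentence on whitespace.

    A deliberately naive splitter: abbreviations such as "e.g." will be
    split as well.
    """
    parts = re.split(r"[.?]", text)
    # Drop empty fragments left by trailing or consecutive delimiters.
    return [part.split() for part in parts if part.strip()]


def split_sessions(events, max_gap_seconds=1800):
    """Group (timestamp, item) pairs into sessions.

    A new session starts whenever the gap between consecutive events
    exceeds max_gap_seconds (a hypothetical threshold for this sketch).
    """
    sessions = []
    current = []
    last_ts = None
    for ts, item in sorted(events, key=lambda event: event[0]):
        if last_ts is not None and ts - last_ts > max_gap_seconds:
            sessions.append(current)
            current = []
        current.append(item)
        last_ts = ts
    if current:
        sessions.append(current)
    return sessions
```

    Each inner list is then one "sentence" for training, so the same corpus
    can be fed to both versions and the resulting models compared.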


