Github user gdfm commented on the pull request:

    https://github.com/apache/incubator-samoa/pull/24#issuecomment-97352811
  
    Unfortunately the patch doesn't seem to solve the issue.
    First, it would be good to write a test to isolate the issue.
    Debugging the example from the Jira issue, I get the following metadata in Instances:
    ```
    [@attribute Dur NUMERIC, @attribute Proto NUMERIC, @attribute Dir NUMERIC,
     @attribute State NUMERIC, @attribute sTos NUMERIC, @attribute dTos NUMERIC,
     @attribute TotPkts NUMERIC, @attribute TotBytes NUMERIC, @attribute SrcBytes NUMERIC,
     @attribute class NUMERIC]
    ```
    There are several issues:
    1) The toString method in Attribute is broken: it always returns NUMERIC.
    2) The ArffLoader does not handle newlines well when they come before the set of values of a nominal attribute. By putting each attribute definition on a single line I managed to get the header parsed correctly:
    ```
    [@attribute Dur NUMERIC, @attribute Proto NOMINAL, @attribute Dir NOMINAL,
     @attribute State NOMINAL, @attribute sTos NUMERIC, @attribute dTos NUMERIC,
     @attribute TotPkts NUMERIC, @attribute TotBytes NUMERIC, @attribute SrcBytes NUMERIC,
     @attribute class NOMINAL]
    ```
    3) Values such as '<->' are not recognized as words by the check on line 97 of ArffLoader:
    ```
    else if (streamTokenizer.sval != null && (streamTokenizer.ttype == StreamTokenizer.TT_WORD
                  || streamTokenizer.ttype == 34)) {
    ```
    For these values the ttype is 39, which corresponds to the single quote ('), so the condition above is false and the value is skipped.
    By modifying the test arff file I managed to make it work.
    I guess we could add an OR clause to the condition that also checks for single quotes.
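
    A minimal, standalone sketch of that extra check is below. It is not the ArffLoader code: the class QuotedTokenSketch and the helpers isStringToken and readValues are hypothetical names, and it assumes a StreamTokenizer with its default syntax table (ArffLoader may configure its tokenizer differently). It only illustrates that a single-quoted value arrives with ttype 39 and a non-null sval.

    ```java
    import java.io.IOException;
    import java.io.StreamTokenizer;
    import java.io.StringReader;
    import java.util.ArrayList;
    import java.util.List;

    public class QuotedTokenSketch {

        // Hypothetical helper (not part of ArffLoader): true when the current
        // token carries a string value, whether it is a bare word, a
        // double-quoted string (ttype 34) or a single-quoted string (ttype 39).
        public static boolean isStringToken(StreamTokenizer st) {
            return st.sval != null
                    && (st.ttype == StreamTokenizer.TT_WORD
                        || st.ttype == '"'
                        || st.ttype == '\'');
        }

        // Collects every string-valued token from the given text, using the
        // default StreamTokenizer configuration (an assumption of this sketch).
        public static List<String> readValues(String text) throws IOException {
            StreamTokenizer st = new StreamTokenizer(new StringReader(text));
            List<String> values = new ArrayList<>();
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (isStringToken(st)) {
                    values.add(st.sval);
                }
            }
            return values;
        }

        public static void main(String[] args) throws IOException {
            // Made-up nominal value set mixing the three quoting styles.
            System.out.println(readValues("{'<->', \"->\", tcp}"));
        }
    }
    ```

    Comparing against the character literals '"' and '\'' instead of 34 and 39 keeps the condition readable. Without the single-quote clause, '<->' would be dropped from the list.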

