Aha, I found the issue. The child process (running HypersearchWorker.py)
was picking up python2.6, which is also installed on the machine. There is
a hard-coded command line containing "python" in permutations_runner.py,
and when I switched it to "python2.7" it worked. Here's the line I changed
in the current code:

https://github.com/numenta/nupic/blob/master/src/nupic/swarming/permutations_runner.py#L676

Is there a standard way of telling a Linux machine which Python to use? I
suppose that would be the best solution. I had made an alias in my .bashrc
pointing "python" to version 2.7, but clearly aliases don't apply to
subprocesses. If there's no way to specify this, then it seems the "python"
command should be configurable, or detectable from the system.
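
For what it's worth, here is a sketch of the "detectable from the system"
option: building the worker command line from sys.executable rather than a
bare "python" guarantees the child runs under the same interpreter as the
parent. (The actual command construction in permutations_runner.py may
differ from this illustration.)

```python
import subprocess
import sys

# sys.executable is the absolute path of the interpreter running this
# script, so the child process uses the same Python version regardless
# of shell aliases or PATH order (aliases in ~/.bashrc are not
# inherited by subprocesses).
cmd = [sys.executable, "-c", "import sys; print(sys.version_info[0])"]
child_major = subprocess.check_output(cmd).strip()
print(child_major)  # same major version as the parent interpreter
```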

On Sun, Nov 1, 2015 at 2:30 PM, Richard Crowder <[email protected]> wrote:

> "linux2" looks fine for the handlers, where they use startswith("linux"),
> so it's not likely to be that. The only other thing I needed to do was to
> delete the swarming files that were generated. So I'm out of ideas for how
> I could get it to work on Windows, and out of ideas for your case too :(
> Unless it's something with different bindings versions or some other
> Python package. Locally I have nupic 0.3.6.dev0 and nupic.bindings 0.2.2
> and a variety of other Python packages.
>
> Does "import os; print os.pathsep" print a colon? I imagine it does.
> Will try a Ubuntu VM though.
>
>
> On Sun, Nov 1, 2015 at 10:08 PM, Ryan J. McCall <[email protected]>
> wrote:
>
>> Hi Richard,
>>
>> Thanks for the reply. I'm not sure what I might change regarding the log
>> handlers. (I see that there is a default logging conf file that I can
>> override in my NTA_CONF_PATH.) In my script I'm able to say:
>>
>> from nupic.support import initLogging
>> initLogging()
>>
>> and I see a difference in the messages logged to console.
>>
>> The swarm-generated files don't seem to be the problem.
>>
>> "import sys; print sys.platform.lower()" gives "linux2"
>>
>> Best,
>>
>> Ryan
>>
>> On Sun, Nov 1, 2015 at 3:19 AM, Richard Crowder <[email protected]> wrote:
>>
>>> Hi Ryan,
>>>
>>> I've just updated my nupic.core and nupic forks with latest from Numenta
>>> master. And faced the exact same problem (but on Windows). I needed to do
>>> two things. Updating sys and file log handlers to support win32
>>> (src\nupic\support\__init__.py) and to delete files generated during the
>>> run of the 'simple' swarming test (with one worker, i.e. no --maxWorkers on
>>> command line). Those changes MAY only be related to the Windows porting,
>>> but here are a few things to try:
>>>
>>> - See what the Python command "import sys; print sys.platform.lower()"
>>> outputs
>>> - Clean up the files generated by the swarming (for me those files were
>>> description.py, permutations.py, the model_0/ directory, and a .pkl and
>>> a .csv file)
>>> - Use the --overwrite flag when swarming with scripts\run_scripts.py
>>>
>>> I'd be interested to see the sys.platform output.
>>>
>>> Regards, Richard.
>>>
>>>
>>> On Sun, Nov 1, 2015 at 1:02 AM, Ryan J. McCall <[email protected]>
>>> wrote:
>>>
>>>> Hello NuPIC,
>>>>
>>>> I'm having an issue with swarming on a RHEL box. I've installed NuPIC
>>>> Version: 0.3.1. I have mysql running and have confirmed that db connections
>>>> can be made with the test_db.py script. The error I'm getting is similar to
>>>> some other threads (traceback below). The hypersearch finishes quickly,
>>>> evaluates 0 models, and throws an exception because there's no result to
>>>> load. I would appreciate any suggestions. It looks like jobs are added to
>>>> the DB based on my debugging. My thought is to debug the HypersearchWorkers
>>>> next which run as separate processes -- have to figure out how to do 
>>>> that...
>>>>
>>>> Many thanks,
>>>>
>>>> Ryan
>>>>
>>>>
>>>> Successfully submitted new HyperSearch job, jobID=1020
>>>> Evaluated 0 models
>>>> HyperSearch finished!
>>>> Worker completion message: None
>>>>
>>>> Results from all experiments:
>>>> ----------------------------------------------------------------
>>>> Generating experiment files in directory: /tmp/tmp0y39RS...
>>>> Writing  313 lines...
>>>> Writing  114 lines...
>>>> done.
>>>> None
>>>> json.loads(jobInfo.results) raised an exception.  Here is some info to
>>>> help with debugging:
>>>> jobInfo:  _jobInfoNamedTuple(jobId=1020, client=u'GRP', clientInfo=u'',
>>>> clientKey=u'', cmdLine=u'$HYPERSEARCH', params=u'{"hsVersion": "v2",
>>>> "maxModels": null, "persistentJobGUID":
>>>> "1a3c7950-8032-11e5-8a23-a0d3c1f9d4f4", "useTerminators": false,
>>>> "description": {"includedFields": [{"fieldName": "time", "fieldType":
>>>> "datetime"}, {"maxValue": 50000, "fieldName": "volume", "fieldType": "int",
>>>> "minValue": 0}], "streamDef": {"info": "rp3_volume", "version": 1,
>>>> "streams": [{"info": "rp3_volume", "source":
>>>> "file:///home/rmccall/experiment/projects/rp3/rp3-training_data.csv",
>>>> "columns": ["*"]}]}, "inferenceType": "TemporalAnomaly", "inferenceArgs":
>>>> {"predictionSteps": [1], "predictedField": "volume"}, "iterationCount": -1,
>>>> "swarmSize": "small"}}',
>>>> jobHash='\x1a<\x81R\x802\x11\xe5\x8a#\xa0\xd3\xc1\xf9\xd4\xf4',
>>>> status=u'notStarted', completionReason=None, completionMsg=None,
>>>> workerCompletionReason=u'success', workerCompletionMsg=None, cancel=0,
>>>> startTime=None, endTime=None, results=None, engJobType=u'hypersearch',
>>>> minimumWorkers=1, maximumWorkers=8, priority=0, engAllocateNewWorkers=1,
>>>> engUntendedDeadWorkers=0, numFailedWorkers=0,
>>>> lastFailedWorkerErrorMsg=None, engCleaningStatus=u'notdone',
>>>> genBaseDescription=None, genPermutations=None,
>>>> engLastUpdateTime=datetime.datetime(2015, 11, 1, 0, 47, 18),
>>>> engCjmConnId=None, engWorkerState=None, engStatus=None,
>>>> engModelMilestones=None)
>>>> jobInfo.results:  None
>>>> EXCEPTION:  expected string or buffer
>>>> Traceback (most recent call last):
>>>>   File "/usr/local/lib/python2.7/pdb.py", line 1314, in main
>>>>     pdb._runscript(mainpyfile)
>>>>   File "/usr/local/lib/python2.7/pdb.py", line 1233, in _runscript
>>>>     self.run(statement)
>>>>   File "/usr/local/lib/python2.7/bdb.py", line 400, in run
>>>>     exec cmd in globals, locals
>>>>   File "<string>", line 1, in <module>
>>>>   File "htmAnomalyDetection.py", line 2, in <module>
>>>>     import argparse
>>>>   File "htmAnomalyDetection.py", line 314, in main
>>>>     runSwarming(args.nupicDataPath, args.projectName, args.maxWorkers,
>>>> args.overwrite)
>>>>   File "htmAnomalyDetection.py", line 164, in runSwarming
>>>>     "overwrite": overwrite})
>>>>   File
>>>> "/usr/local/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py",
>>>> line 277, in runWithConfig
>>>>     return _runAction(runOptions)
>>>>   File
>>>> "/usr/local/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py",
>>>> line 218, in _runAction
>>>>     returnValue = _runHyperSearch(runOptions)
>>>>   File
>>>> "/usr/local/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py",
>>>> line 161, in _runHyperSearch
>>>>     metricsKeys=search.getDiscoveredMetricsKeys())
>>>>   File
>>>> "/usr/local/lib/python2.7/site-packages/nupic/swarming/permutations_runner.py",
>>>> line 826, in generateReport
>>>>     results = json.loads(jobInfo.results)
>>>>   File
>>>> "/usr/local/lib/python2.7/site-packages/nupic/swarming/object_json.py",
>>>> line 163, in loads
>>>>     json.loads(s, object_hook=objectDecoderHook, **kwargs))
>>>>   File "/usr/local/lib/python2.7/json/__init__.py", line 351, in loads
>>>>     return cls(encoding=encoding, **kw).decode(s)
>>>>   File "/usr/local/lib/python2.7/json/decoder.py", line 366, in decode
>>>>     obj, end = self.raw_decode(s, idx=_w(s, 0).end())
>>>> TypeError: expected string or buffer
>>>>
>>>> --
>>>> Ryan J. McCall
>>>> ryanjmccall.com
>>>>
>>>
>>>
>>
>>
>> --
>> Ryan J. McCall
>> ryanjmccall.com
>>
>
>


-- 
Ryan J. McCall
ryanjmccall.com
