Hi Brian --
I believe that setting CHPL_TARGET_ARCH to 'native' should get you better
results as long as you're not cross-compiling. Alternatively, you can set
it to 'none', which will squash the warning you're getting. In any case, I
wouldn't expect the lack of --specialize optimizations to be the problem
here (but if you're passing the components of --fast by hand, you'd want to
be sure to add -O in addition to --no-checks).
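As a rough sketch of the settings above (not a tested recipe for your system): "index.chpl" and the output name are placeholders I made up, and depending on your installation the Chapel runtime may need to be built for the CHPL_TARGET_ARCH value you pick.

```shell
# Sketch only. "index.chpl" is a placeholder source file name.

# 'native' targets the CPU of the machine doing the compile, so skip it when
# cross-compiling. Use 'none' instead to just silence the --specialize warning.
export CHPL_TARGET_ARCH=native

# The pieces of --fast mentioned above, passed by hand.
CHPL_FLAGS="-O --no-checks"

# Compile only if chpl and the placeholder source are present; otherwise
# just show the command that would be run.
if command -v chpl >/dev/null 2>&1 && [ -f index.chpl ]; then
  chpl $CHPL_FLAGS -o index index.chpl
else
  echo "would run: chpl $CHPL_FLAGS -o index index.chpl"
fi
```

Once --fast itself works, it should replace the hand-picked flags.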
Generally speaking, Chapel programs compiled for --no-local (multi-locale
execution) tend to generate much worse per-node code than those compiled
for --local (single-locale execution), and this is an area of active
optimization effort. See the "Performance Optimizations and Generated
Code Improvements" release note slides at:
http://chapel.cray.com/download.html#releaseNotes
http://chapel.cray.com/releaseNotes/1.11/06-PerfGenCode.pdf
and particularly, the section entitled "the 'local field' pragma" for more
details on this effort (starts at slide 34).
In a nutshell, the Chapel compiler conservatively assumes that things are
remote rather than local when in doubt (to emphasize correctness over fast
but incorrect programs), and then gets into doubt far more often than it
should. We're currently working on tightening up this gap.
This could explain the full difference in performance that you're seeing,
or something else may be happening. One way to check might be to compile
with --local vs. --no-local under CHPL_COMM=none and compare the run times
to see how much overhead --no-local adds. The fact that all CPUs are pegged is a good
indication that you don't have a problem with load balance or distributing
data/computation across nodes, I'd guess?
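The --local vs. --no-local comparison could look roughly like the following. This is a sketch under assumptions: "index.chpl" and the binary names are placeholders, the optimization flags are just the ones discussed in this thread, and a runtime built for CHPL_COMM=none may be needed.

```shell
# Sketch only. "index.chpl" and the binary names are placeholders.
# With CHPL_COMM=none everything runs on one locale, so any slowdown in the
# --no-local build comes from the generated code rather than the network.
export CHPL_COMM=none

BINARIES=""
for mode in local no-local; do
  bin="index-$mode"
  BINARIES="$BINARIES $bin"
  if command -v chpl >/dev/null 2>&1 && [ -f index.chpl ]; then
    # Same source and flags; only --local / --no-local differs.
    chpl -O --no-checks "--$mode" -o "$bin" index.chpl
    time "./$bin"
  else
    echo "would run: chpl -O --no-checks --$mode -o $bin index.chpl && time ./$bin"
  fi
done
```

If the two timings are close, the --no-local code generation is probably not the main cause and something else is worth looking at.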
-Brad
On Mon, 11 May 2015, Brian Guarraci wrote:
> I should add that I did supply --no-checks and that helped about 10%.
>
> On Mon, May 11, 2015 at 10:04 AM, Brian Guarraci <[email protected]> wrote:
>
>> It says:
>>
>> warning: --specialize was set, but CHPL_TARGET_ARCH is 'unknown'. If you
>> want any specialization to occur please set CHPL_TARGET_ARCH to a proper
>> value.
>> It's unclear which target arch is appropriate.
>>
>> On Mon, May 11, 2015 at 9:55 AM, Brad Chamberlain <[email protected]> wrote:
>>
>>>
>>> Hi Brian --
>>>
>>> Getting --fast working should definitely be the first priority. What
>>> about it fails to work?
>>>
>>> -Brad
>>>
>>>
>>>
>>> On Sun, 10 May 2015, Brian Guarraci wrote:
>>>
>>>> Hi,
>>>>
>>>> I've been testing my search index on my 16 node ARM system and have been
>>>> running into some strange behavior. The cool part is that the locale
>>>> partitioning concept seems to work well; the downside is that the system
>>>> is very slow. I've rewritten the approach a few different ways and haven't
>>>> made a dent, so I wanted to ask a few questions.
>>>>
>>>> On the ARM processors, I can only use FIFO and can't optimize (--fast
>>>> doesn't work). Is this going to significantly affect cross-locale
>>>> performance?
>>>>
>>>> I've looked at the generated C code and tried to minimize the _comm_
>>>> operations in core methods, but it doesn't seem to help. Network usage is
>>>> still quite low (100K/s) while CPUs are pegged. Are there any profiling
>>>> tools I can use to understand what might be going on here?
>>>>
>>>> Generally, on my laptop or single node, I can index about 1.1MM records
>>>> in under 10s. With 16 nodes, it takes 10min to do 100k records.
>>>>
>>>> Wondering if there's some systemic issue at play here and how I can
>>>> investigate further.
>>>>
>>>> Thanks!
>>>> Brian
>>>>
>>>>
>>
>
_______________________________________________
Chapel-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/chapel-developers