Re: [U2] [u2] Parallel processing in Universe

2012-10-03 Thread Wjhonson
A Spanner deployment is called a Universe

-Original Message-
From: Robert Colquhoun robert.colquh...@gmail.com
To: U2 Users List u2-users@listserver.u2ug.org
Sent: Tue, Oct 2, 2012 9:13 pm
Subject: Re: [U2] [u2] Parallel processing in Universe


On Tue, Oct 2, 2012 at 5:58 PM, Symeon Breen syme...@gmail.com wrote:

 However, MapReduce and Hadoop are pretty horrible things. Even Google has
 moved away from them with Caffeine etc.


Going OT a little, I think Google is replacing BigTable, which was part of
Caffeine in 2010, with Spanner now. Here is a doc about it released last
month:

http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/spanner-osdi2012.pdf

...amusingly they call it a Multi-Version Database, can't wait till that
gets abbreviated.

- Robert


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Symeon Breen
Oracle and SQL Server both use map-reduce internally when doing collations
and totals. However, they work differently to U2 in that they have one big
process that runs queries from the clients. This process can then cache,
multithread and map-reduce. U2 is architected differently in that the client
processes (the uv or udt processes) actually do the work and the central udt
processes are fairly slim. These client processes are single threaded. Any
multithreading/multiprocessing is part of the application rather than
inherent in the database.
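
For example, the usual application-level shape is a master that spawns
phantoms and tells each one which slice of the work is its own. A minimal
sketch (the worker name BP PARTITION.WORKER and the argument convention are
invented for illustration; each phantom reads its arguments back off
@SENTENCE):

   NPART = 4
   FOR P = 1 TO NPART
      * spawn one background worker per partition
      EXECUTE 'PHANTOM RUN BP PARTITION.WORKER ':P:' ':NPART
   NEXT P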

One option is to make U2 a Hadoop-supported data store; you could then
map-reduce across multiple instances using whatever Hadoop-supporting
toolset you wanted.

However, MapReduce and Hadoop are pretty horrible things. Even Google has
moved away from them with Caffeine etc.



-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Wjhonson
Sent: 01 October 2012 21:05
To: u2-users@listserver.u2ug.org
Subject: [U2] [u2] Parallel processing in Universe


What's the largest dataset in the Universe user world?
In terms of number of records.

I'm wondering if we have any potential for utilities that map-reduce.
I suppose you would spawn phantoms but how do they communicate back to the
master node?
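
One simple way for the phantoms to report back, for what it's worth: have
each phantom write its partial result to an agreed hashed file and let the
master poll for completion. A sketch only (the RESULTS file and the DONE.n
key convention are invented):

   * worker number P writes its partial result when it finishes:
   WRITE PARTIAL.TOTALS TO F.RESULTS, 'DONE.':P

   * master, after spawning NPART phantoms:
   OPEN 'RESULTS' TO F.RESULTS ELSE STOP 'no RESULTS file'
   LOOP
      FINISHED = 0
      FOR P = 1 TO NPART
         READ PARTIAL FROM F.RESULTS, 'DONE.':P THEN FINISHED += 1
      NEXT P
   UNTIL FINISHED = NPART DO SLEEP 1
   REPEAT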


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wols Lists
On 01/10/12 22:47, Robert Houben wrote:
 Create an index on a dict pointing at the first character of the key, and 
 have each phantom take two digits. (0-1, 2-3, 4-5, 6-7, 8-9)
 
Actually, this is a very BAD way of chopping up a file into five even
chunks.

I'm not sure of the stats, but on any file with sequential keys, the
first phantom will get the majority of the records, the second gets the
majority of what's left, and so on.

A lot of people make the mistake of thinking this is a good technique.
I'm not even sure it works well with random numbers...
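
The skew is easy to see with sequential numeric keys: of the keys 1 to
12345, 3,457 (28%) start with '1' but only 1,111 (9%) start with '9', so
the phantom that takes 0-1 gets far more than a fifth of the file. A
throwaway check (a sketch, assuming plain numeric keys):

   DIM CNT(10)
   MAT CNT = 0
   FOR ID = 1 TO 12345
      D = ID[1,1]           ;* leading digit of the key
      CNT(D+1) += 1
   NEXT ID
   FOR D = 0 TO 9
      CRT D, CNT(D+1)
   NEXT D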

Cheers,
Wol


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wols Lists
On 02/10/12 03:49, Ross Ferris wrote:
 If the file were big enough, and already had part files, then I believe that 
 you could have a phantom process each of the individual parts. Failing that, 
 get an SSD - relatively cheap, and it will give your processing a reasonable 
 kick along!!
 
Just be careful with an SSD. If you have a power fail in the middle of
your process, that is exactly the scenario that will trash it. As in,
totally dead, no recovery possible.

SSDs are great, but a power fail during a write can take out the
controller. One dead, irrecoverable disk. And if you're hammering the
I/O you are VERY vulnerable.

Cheers,
Wol


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread George Gallen
What about a striped array of SSDs with a backup battery to flush the write
buffer on power fail? No more dangerous (IMO) than an array of hard drives -
but given the limited write cycles of an SSD, that could be more of a danger,
unless you're using larger drives and not a lot of data, so the drive has
lots of area to fail over to when it reaches its write maximum.



Re: [U2] [u2] Parallel processing in Universe (Unclassified)

2012-10-02 Thread Doug Averch
Only outside of U2, using UniObjects, can you achieve any type of parallel
activity. We have, through UniObjects, got 80 processes working from a single
Eclipse session through the use of threads in Java.

UniObjects creates an individual uvapi_slave or udapi_slave for each of these
processes, but the system, or in this case the udapi_server or uvapi_server,
cannot handle as many threads as we would like.  We never ran out of memory
on our 8GB Windows 2008R2 server, nor did our 120GB SSD fail to keep up
with the 80 ANALYZE.FILES or the 80 RESIZE commands we were issuing from
our XLr8Resizer product within Eclipse.

The only way we got this working was to set the retries to 1000 on
reopening the connections.  Although that number seems high, it helped and
got us from our previous best of 39 processes to 80 processes. When we have a
lot of time and cannot think of anything better to do we will try for 500
processes.

Regards,
Doug
www.u2logic.com
Eclipse based tools for the U2 programmer




 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org [mailto:
 u2-users-boun...@listserver.u2ug.org] On Behalf Of HENDERSON MIKE, MR
 Sent: Tuesday, 2 October 2012 1:18 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe (Unclassified)

 I have often thought about this - mostly in an idle moment or as a
 displacement activity for something less amusing that I ought to be doing.
 ;-)


 First of all, Universe is already extremely parallel: there's a separate
 O/S thread for each TTY and for each phantom, and you can't get more
 parallel than that for interactive processing.

 So you want more parallelism for your batch processes.
 Different applications have different degrees of inherent parallelism.
 For example in utility billing systems there is frequently the concept of
 a group of premises - based on the old concept of a foot-borne meter reader
 with a 'book' of readings to get. Each 'book' can be processed
 independently of every other. In payroll, each employee's record can be
 processed independently. Other areas of commerce have different
 characteristics.

 I think that whatever unit of parallelism you settle for, you'd need three
 processes: a 'dispatcher' that selects records for processing and queues
 them into some structure for processing; a set of 'workers' that take
 queued work items, process them, mark them as processed and put the results
 in some common store; and a 'monitor' that looks for unprocessed records
 and indications of stuck processes, and collates the results for final
 output.
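
 In outline, the dispatcher and one worker might look like this (a sketch;
 the QUEUE work file, its empty-item-means-pending convention, and F.SOURCE
 are invented for illustration):

    * dispatcher: queue every key that needs processing
    OPEN 'QUEUE' TO F.QUEUE ELSE STOP 'no QUEUE file'
    SELECT F.SOURCE
    LOOP
       READNEXT ID ELSE EXIT
       WRITE '' TO F.QUEUE, ID       ;* empty item = not yet processed
    REPEAT

    * worker: claim items via the record lock, mark them done
    SELECT F.QUEUE
    LOOP
       READNEXT ID ELSE EXIT
       READU ITEM FROM F.QUEUE, ID LOCKED CONTINUE THEN
          IF ITEM # '' THEN RELEASE F.QUEUE, ID; CONTINUE
          * ... process the source record for ID here ...
          WRITE 'DONE' TO F.QUEUE, ID   ;* the WRITE releases the lock
       END ELSE NULL
    REPEAT

 The monitor is then a third process scanning QUEUE for items still empty
 after some deadline, re-queuing or reporting them.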
 I've seen a couple of versions of this, one for electricity billings and
 another for overnight batch-processing of report requests, both well over a
 decade ago, and neither still in use although their underlying packages are
 still being run.

 The major issue is that these days the whole entity in the general
 commercial world is far more likely to be I/O limited than CPU limited, and
 therefore introducing parallelism will be no help at all if the I/O system
 is already choked.
 Even if the system is currently CPU-limited, multi-threading may not
 produce much improvement without very careful design of the record locking
 philosophy - introducing parallelism will be no help if all the threads end
 up contending serially for one record lock or a small set of locks.


 If you want it to go faster, buy the CPU with the fastest clock you can
 get (not the one with the most cores), and put your database on SSD like
 Ross said.
 The Power7+ chips being announced any day now are rumoured to go to
 5GHz+, maybe even more if you have half the cores on the chip disabled.


 Regards


 Mike


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wols Lists
On 02/10/12 15:28, George Gallen wrote:
 What about a striped array of SSDs with a backup battery to flush the write buffer on power fail? [...]

I guess a backup battery would save you. Basically, anything to prevent
power dying in the middle of a write. But the striped array would
probably simply mean several trashed drives instead of one. It's a
known, guaranteed, this-is-what-will-kill-a-drive scenario, and an
array would just mean more drives at risk.

The place I came across a major discussion about this (I knew of the
issue earlier) said that some combination of Windows, a particular update,
and a certain laptop was notorious for writing off drives. The update would
flood the cache, then the laptop would suspend. Cue one dead drive and, if
within warranty, one no-quibble replacement.

Cheers,
Wol
 




Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wjhonson

Yes, the low numbers are used more often.
However, if you have sequential keys, just use the *last* two digits instead 
of the first two.
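
With numeric sequential keys the cleanest version of that is to share out
the keys modulo the number of phantoms, which amounts to splitting on the
trailing digits rather than the leading ones. A sketch of the worker side
(the ORDERS file and the @SENTENCE argument convention are invented):

   ARGS = @SENTENCE                   ;* e.g. 'RUN BP WORKER 3 5'
   SLOT = FIELD(ARGS, ' ', 4)         ;* this phantom's slot, 0 to NPART-1
   NPART = FIELD(ARGS, ' ', 5)
   OPEN 'ORDERS' TO F.ORDERS ELSE STOP 'cannot open ORDERS'
   SELECT F.ORDERS
   LOOP
      READNEXT ID ELSE EXIT
      IF MOD(ID, NPART) # SLOT THEN CONTINUE   ;* not ours
      * ... process record ID ...
   REPEAT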





Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread David Wolverton
In my example, I would grab 'whatever' records were hashed into the 'group'
-- while it's not perfect since there is 'overflow' -- I was just trying to
think of a way to break a file into pieces that would otherwise process much
like a BASIC select - just grab the 'group' and go ... I can see it's
probably not possible, but the topic got me thinking about 'what if'...
(And we're UniData - so I have to apply that filter to most everything I
read on the list anyway <G>)

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Taylor
Sent: Monday, October 01, 2012 6:10 PM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

Or, let's suppose you wanted to process repetitive segments of one very
large record using the same logic in a separate phantom process for each
segment, how large a record can be read and processed in Universe?

Dave

 So how would a user 'chop up' a file for parallel processing?  
 Ideally, if there was a Mod 10001 file (or whatever) it would seem like 
 it would be 'ideal' to assign 2000 groups to 5 phantoms -- but I don't 
 know how to 'start a BASIC select at Group 2001 or 4001' ...

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of George 
 Gallen
 Sent: Monday, October 01, 2012 3:29 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 OPENSEQ '/tmp/pipetest' TO F.PIPE ELSE STOP 'NO PIPE'
 LOOP
    READSEQ LINE FROM F.PIPE ELSE CONTINUE
    PRINT LINE
 REPEAT
 STOP
 END

 Although, not sure if you might need to sleep a little between the 
 READSEQ's ELSE and CONTINUE -- it might suck up CPU time when nothing is 
 writing to the file.

 Then you could set up a printer in UV that did a "cat - > /tmp/pipetest".

 Now your phantom just needs to print to that printer.

 George

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of George 
 Gallen
 Sent: Monday, October 01, 2012 4:16 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 The only thing about a pipe is that once it's closed, I believe it has 
 to be re-opened by both ends again. So if point A opens one end, and 
 point B opens the other end, once either end closes, it closes for 
 both sides, and both sides would have to reopen again to use it.

 To eliminate this, you could have one end open a file, and have the 
 other sides do a ">>" append to that file - just make sure you include 
 some kind of data header so the reading side knows which process just 
 wrote the data.

 -Original Message-
 From: u2-users-boun...@listserver.u2ug.org
 [mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of u2ug
 Sent: Monday, October 01, 2012 4:11 PM
 To: U2 Users List
 Subject: Re: [U2] [u2] Parallel processing in Universe

 pipes




Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread George Gallen
What if you created a duplicate file, did a SELECT and saved the list
(non-sorted)?

Each of the phantoms would do a GETLIST and loop through using
READLIST/READU, and if a record were already locked, skip it until it reads
an unlocked record (and locks it). Delete the record when finished.
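
In skeleton form (a sketch; the WORK.COPY file and the saved-list name are
invented, and it assumes UniVerse-style GETLIST/READU):

   OPEN 'WORK.COPY' TO F.WORK ELSE STOP 'no WORK.COPY'
   GETLIST 'WORK.IDS' TO WLIST ELSE STOP 'no saved list'
   LOOP
      READNEXT ID FROM WLIST ELSE EXIT
      READU REC FROM F.WORK, ID LOCKED CONTINUE THEN
         * ... process REC ...
         DELETE F.WORK, ID          ;* the DELETE releases the lock
      END ELSE RELEASE F.WORK, ID   ;* already deleted by another phantom
   REPEAT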




Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread David Wolverton
AH - you would not even have to 'delete' as long as the 'locks' are held
long enough -- meaning if you know you will have 20 phantoms, each phantom
would keep a list of 'keys locked', and once it hits 21 (or 40 if you want
insurance LOL) in the list, it would unlock the earliest lock -- that way
there is no way any other phantom could process anything twice...

As each phantom runs, if it hits a locked record, it would move on to the
next item in the list.
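
That sliding window of locks might look like this inside the worker loop
(a sketch; F.WORK is the work file from the earlier sketch, and the window
of 20 matches the example above):

   LOCKED.IDS = ''
   LOOP
      READNEXT ID ELSE EXIT
      READU REC FROM F.WORK, ID LOCKED CONTINUE THEN
         * ... process REC ...
         LOCKED.IDS<-1> = ID                 ;* remember our lock
         IF DCOUNT(LOCKED.IDS, @AM) > 20 THEN
            RELEASE F.WORK, LOCKED.IDS<1>    ;* let the oldest one go
            DEL LOCKED.IDS<1>
         END
      END ELSE NULL
   REPEAT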

Great idea!

DW


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Daniel McGrath
You've highlighted one problem here.

By having multiple processes accessing the disk in different locations, you
destroy cache optimization and inflate seek times. More phantoms = less
performance. This assumes I/O is a bigger concern than CPU, which is
generally the case.

More phantoms = more communication, which also adds another overhead that
reduces performance.

By introducing more phantoms than CPU cores, you increase the amount of
context switching, which once again hurts your cache usage as well as adding
bigger overheads on the CPU again.

In short, except for very specific cases, increasing 'concurrency' through
phantoms on a single machine is generally ill-advised, resulting in longer
processing times, higher average system loads and, worse yet, greater system
complexity (and hence ways for things to break).

As mentioned earlier, more system-level architectural changes (such as
multiple machines, or at least file storage on different disks/spindles for
each process) are required if you want to benefit from this sort of work.


-Original Message-
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Wolverton 
Sent: Monday, October 01, 2012 4:47 PM
To: 'U2 Users List'
Subject: Re: [U2] [u2] Parallel processing in Universe

OK - I was trying to create a 'smoother use' of the disk and 'read ahead' --
in this example the disk would be chattering from the heads moving all over
the place. I was trying to find a way to make this process more 'orderly' --
is there one?

-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Robert Houben
Sent: Monday, October 01, 2012 4:48 PM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

Create an index on a dict pointing at the first character of the key, and have 
each phantom take two digits. (0-1, 2-3, 4-5, 6-7, 8-9)


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread David Wolverton
Great point!!  I think we can agree that 'spinning media latency' is the
enemy and having phantoms increase the 'head dance' can make things worse,
not better!

Many problems go away or become trivial as the spinning media trails to the
sunset.  I've advised customers that just moving 'code files' to a tiny SSD
would likely increase overall system performance on Windows boxes.  Just
waiting until the price for Enterprise SSDs makes them a no-brainer...
Until then, even small SSDs will help!




Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Daniel McGrath
Yes, SSD will definitely help. Just keep in mind, it doesn't prevent all
negatives in regards to I/O, particularly with regard to caching.

Disk caching in a modern system is fairly complex, but at the high level it
is not only done by the controller, but by the OS as well. So randomly flying
around the disk still causes cache thrashing. :(



Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wjhonson

The idea of the phantoms would be to read the file in order, not randomly,
just in order from five different starting points.
So you should still get the benefit of some caching.




Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread George Gallen
If 5 phantoms were running, and read in order but from 5 different starting
points, the records would essentially still be processed in a random order,
if you were to lay out the record IDs as they get processed.

George



Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wjhonson

The point of the caching concern is related to read-ahead, and you will
still get some benefit from this if your five phantoms are reading their
*portion* of the file in order, which they should.





Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread George Gallen
OK, I see what you're saying... I'll buy that.



Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread David Wolverton
Which was my question -- was there a way to 'jump to' a group or 'BASIC
SELECT' with a 'starting/ending' group -- so that again, with modulo 10001,
one phantom does 'groups' 1-2000, the next phantom does 'groups' 2001-4000,
etc... But I can't see that it's really possible without jumping through
hoops that make it unattractive at best!  At least on UniData!

DW



Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Wjhonson

You may not need to know what *group* you are in per se, if you are willing
to use the file stats record.
You can determine from the last stats how many records are in your file.

Then your master program just reads the keys until it gets to the 50,000th
key (or whatever), and then spawns a phantom, telling it which key to start
with and how many keys to process before it ends.

Or maybe you don't need the stat file if UniData has @SELECTED to tell you
how many keys there are


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Ross Ferris
Could also avoid the lock contention if each phantom had knowledge of the
others, so phantom 1 could only process @ID 1, 6, 11 etc., phantom 2 would do
2, 7, 12 and so on.

Of course, if you are operating with a select list, this already implies that
you have processed the file once, so your batch process is actually a
re-read. In the absence of a suitable index, perhaps employing the
Drumheller trick would be worth consideration.
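
If the @IDs are numeric, the striping is a one-line test - a minimal sketch
in UniVerse/UniData BASIC, assuming NPHANTOMS and MY.SLOT (1 to NPHANTOMS)
get passed to each phantom somehow; the names are illustrative:

* Every phantom walks the same file, but only touches the IDs in
* its own stripe, so no two workers contend for the same lock.
OPEN 'CUSTFILE' TO F.CUST ELSE STOP 201, 'CUSTFILE'
SELECT F.CUST
LOOP WHILE READNEXT ID DO
   IF MOD(ID, NPHANTOMS) # MOD(MY.SLOT, NPHANTOMS) THEN CONTINUE
   READU REC FROM F.CUST, ID THEN
      GOSUB PROCESS.REC
      WRITE REC TO F.CUST, ID
   END ELSE RELEASE F.CUST, ID
REPEAT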

Ross Ferris
Stamina Software
Visage - Better by Design!


___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Ross Ferris
Depends on what you call a no brainer -- to me, $4K for an 800GB Intel 910
SSD seems reasonable for what you get (10x full drive writes every day for 5
years has the endurance angle covered IMHO; 400GB is $2K if your database
will fit), and by today's standards that represents reasonable value. Not
quite at the performance level of Fusion IO, but cheap enough to just about
be affordable.

Ross Ferris
Stamina Software
Visage - Better by Design!

-Original Message-
From: u2-users-boun...@listserver.u2ug.org 
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of David Wolverton 
Sent: Wednesday, 3 October 2012 3:19 AM
To: 'U2 Users List'
Subject: Re: [U2] [u2] Parallel processing in Universe

Great point!!  I think we can agree that 'spinning media latency' is the enemy
and having phantoms increase the 'head dance' can make things worse, not
better!

Many problems go away or become trivial as spinning media rides off into the
sunset.  I've advised customers that just moving 'code files' to a tiny SSD
would likely increase overall system performance on Windows boxes.  Just
waiting until the price of Enterprise SSDs makes them a no-brainer...
Until then, even small SSDs will help!



-Original Message-
From: u2-users-boun...@listserver.u2ug.org
[mailto:u2-users-boun...@listserver.u2ug.org] On Behalf Of Daniel McGrath
Sent: Tuesday, October 02, 2012 12:05 PM
To: U2 Users List
Subject: Re: [U2] [u2] Parallel processing in Universe

You've highlighted one problem here.

By having multiple processes accessing the disk in different locations, you
destroy cache optimization and seek times. More phantoms = less performance.
This assumes I/O is a bigger concern than CPU, which is generally the case.

More phantoms = more communication, which also adds another overhead that
reduces performance.

If you introduce more phantoms than CPU cores, you increase the amount of
context switching, which once again hurts your cache usage as well as adding
bigger overheads on the CPU.

In short, except for very specific cases, increasing 'concurrency' through
phantoms on a single machine is generally ill-advised, resulting in longer
processing times, higher average system loads and, worse yet, greater system
complexity (and hence ways for things to break).

As mentioned earlier, more system-level architectural changes (such as
multiple machines, or at least file storage on different disks/spindles for
each process) are required if you want to benefit from this sort of work.



Re: [U2] [u2] Parallel processing in Universe

2012-10-02 Thread Robert Colquhoun
On Tue, Oct 2, 2012 at 5:58 PM, Symeon Breen syme...@gmail.com wrote:

 However map reduce and hadoop are pretty horrible things. Even Google have
 moved away from it with Caffeine etc.


Going OT a little, I think Google is replacing BigTable, which was part of
Caffeine in 2010, with Spanner now.  Here is a doc about it released last
month:

http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/spanner-osdi2012.pdf

...amusingly they call it a Multi-Version Database, can't wait till that
gets abbreviated.

- Robert
___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


[U2] [u2] Parallel processing in Universe

2012-10-01 Thread Wjhonson

What's the largest dataset in the Universe user world, in terms of number of
records?

I'm wondering if we have any potential for utilities that map-reduce.
I suppose you would spawn phantoms but how do they communicate back to the 
master node?
___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-01 Thread u2ug
pipes


___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-01 Thread George Gallen
The only thing about a pipe is that once it's closed, I believe it has to be
re-opened by both ends again. So if point A opens one end, and point B opens
the other end, once either end closes, it closes for both sides, and both
sides would have to reopen it again to use it.

To eliminate this, you could have one end open a file, and have the other
sides do a >> append to that file - just make sure you include some kind of
data header so the reading side knows which process just wrote the data.
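
On the writer side that might look like the sketch below (UniVerse BASIC; the
path and the 'PH' tag format are illustrative, and truly concurrent appends
would still need care):

* Worker: append one tagged line per result so the reader can tell
* which phantom wrote it.
OPENSEQ '/tmp/results.log' TO F.LOG ELSE
   CREATE F.LOG ELSE STOP 'CANNOT CREATE /tmp/results.log'
END
SEEK F.LOG, 0, 2 ELSE STOP 'SEEK FAILED'  ;* 2 = relative to end of file
WRITESEQ 'PH':@USERNO:'|':RESULT TO F.LOG ELSE STOP 'WRITE FAILED'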

___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-01 Thread George Gallen
OPENSEQ '/tmp/pipetest' TO F.PIPE ELSE STOP 'NO PIPE'
LOOP
   READSEQ LINE FROM F.PIPE ELSE CONTINUE
   PRINT LINE
REPEAT
STOP
END

Although, not sure if you might need to sleep a little between the READSEQ's
ELSE and CONTINUE - it might suck up CPU time when nothing is writing to the
file.

Then you could set up a printer in UV that did a 'cat - > /tmp/pipetest'.

Now your phantom just needs to print to that printer.

George
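
Or, skipping the printer, each phantom could write straight to the same named
pipe - a sketch, assuming the FIFO already exists (mkfifo /tmp/pipetest), the
master is running the reader loop above, and N.DONE is whatever count the
worker kept; none of this is from a tested setup:

* Worker side: open the FIFO and send results a line at a time;
* the master's READSEQ loop picks them up.
OPENSEQ '/tmp/pipetest' TO F.PIPE ELSE STOP 'NO PIPE'
WRITESEQ 'PHANTOM ':@USERNO:' DONE, PROCESSED ':N.DONE TO F.PIPE ELSE
   STOP 'WRITE FAILED'
END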

___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-01 Thread David Wolverton
So how would a user 'chop up' a file for parallel processing?  Ideally, if
there was a Mod 10001 file (or whatever), it would seem 'ideal' to assign
2000 groups to each of 5 phantoms -- but I don't know how to 'start a BASIC
SELECT at group 2001 or 4001'...


___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-01 Thread Robert Houben
Create an index on a dict pointing at the first character of the key, and have 
each phantom take two digits. (0-1, 2-3, 4-5, 6-7, 8-9)
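
That might be sketched like this in UniVerse terms (the file name CUSTFILE,
dict item K1 and the worker logic are all illustrative, not a tested
implementation):

* DICT CUSTFILE item K1 - an I-descriptor on the key's first character:
*   001: I
*   002: @ID[1,1]
*   003:
*   004: First char
*   005: 1L
*   006: S
* Build it once:  CREATE.INDEX CUSTFILE K1  then  BUILD.INDEX CUSTFILE K1
* Each phantom then selects only its own pair of leading digits:
OPEN 'CUSTFILE' TO F.CUST ELSE STOP 201, 'CUSTFILE'
EXECUTE 'SELECT CUSTFILE WITH K1 = "0" OR WITH K1 = "1"'
LOOP WHILE READNEXT ID DO
   READ REC FROM F.CUST, ID THEN GOSUB PROCESS.REC
REPEAT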

___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-01 Thread David Wolverton
OK - I was trying to create a 'smoother use' of the disk and 'read ahead' --
in this example the disk would be chattering from the heads moving all over
the place. I was trying to find a way to make this process more 'orderly' --
is there one?


___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-01 Thread Wjhonson

The GROUP.STAT.DETAIL command will tell you the keys, in stored order, in each 
group of a hashed file.



___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-01 Thread u2ug
True - but why would you want it any other way?
Once one end closes it, the process is complete.



___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-01 Thread David Taylor
Or, let's suppose you wanted to process repetitive segments of one very
large record, using the same logic in a separate phantom process for each
segment. How large a record can be read and processed in Universe?

Dave
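
As a sketch of the splitting side, assuming the repetitive segments are
multivalues in attribute 1, and an illustrative SEG.WORKER phantom that
rereads the record and handles values FIRST through LAST:

OPEN 'BIGFILE' TO F.BIG ELSE STOP 201, 'BIGFILE'
READ BIGREC FROM F.BIG, 'HUGE.ID' ELSE STOP 202, 'HUGE.ID'
NSEG = DCOUNT(BIGREC<1>, @VM)     ;* number of repetitive segments
NP = 4                            ;* phantoms to spawn
PER = INT((NSEG + NP - 1) / NP)   ;* segments per phantom, rounded up
FOR P = 1 TO NP
   FIRST = (P - 1) * PER + 1
   LAST = P * PER
   IF LAST > NSEG THEN LAST = NSEG
   IF FIRST <= NSEG THEN
      EXECUTE 'PHANTOM RUN BP SEG.WORKER HUGE.ID ':FIRST:' ':LAST
   END
NEXT P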



___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users


Re: [U2] [u2] Parallel processing in Universe

2012-10-01 Thread Ross Ferris
If the file were big enough and already had part files, then I believe that
you could have a phantom process each of the individual parts. Failing that,
get an SSD - relatively cheap, and it will give your processing a reasonable
kick along!!
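
A sketch of the part-file idea, assuming the parts are ordinary hashed files
named ORDERS.P1 through ORDERS.P4, and assuming @SENTENCE carries the
phantom's command line so the part name arrives as its last token (all names
illustrative):

* Master: one phantom per part file.
FOR P = 1 TO 4
   EXECUTE 'PHANTOM RUN BP PART.WORKER ORDERS.P':P
NEXT P

* PART.WORKER: take the part name as the last token of the command
* line, then open and walk just that part.
PART = FIELD(TRIM(@SENTENCE), ' ', DCOUNT(TRIM(@SENTENCE), ' '))
OPEN PART TO F.PART ELSE STOP 201, PART
SELECT F.PART
LOOP WHILE READNEXT ID DO
   READ REC FROM F.PART, ID THEN GOSUB PROCESS.REC
REPEAT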

Ross Ferris
Stamina Software
Visage - Better by Design!


___
U2-Users mailing list
U2-Users@listserver.u2ug.org
http://listserver.u2ug.org/mailman/listinfo/u2-users

Re: [U2] [u2] Parallel processing in Universe (Unclassified)

2012-10-01 Thread HENDERSON MIKE, MR
I have often thought about this - mostly in an idle moment or as a
displacement activity for something less amusing that I ought to be
doing. ;-)


First of all, Universe is already extremely parallel: there's a separate
O/S thread for each TTY and for each phantom, and you can't get more
parallel than that for interactive processing.

So you want more parallelism for your batch processes.
Different applications have different degrees of inherent parallelism.
For example in utility billing systems there is frequently the concept
of a group of premises - based on the old concept of a foot-borne meter
reader with a 'book' of readings to get. Each 'book' can be processed
independently of every other. In payroll, each employee's record can be
processed independently. Other areas of commerce have different
characteristics.

I think that whatever unit of parallelism you settle for, you'd need
three processes: a 'dispatcher' that selects records for processing and
queues them into some structure for processing; a set of 'workers' that
take queued work items, process them, mark them as processed and put the
results in some common store; and a 'monitor' that looks for unprocessed
records and indications of stuck processes, and collates the results for
final output.
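
A minimal sketch of that shape in UniVerse BASIC, assuming a hashed QUEUE
file; the 'monitor' is left out, and the file names and PENDING/DONE states
are illustrative:

* Dispatcher: queue one work item per record to be processed.
OPEN 'QUEUE' TO F.QUEUE ELSE STOP 201, 'QUEUE'
EXECUTE 'SELECT ACCOUNTS WITH STATUS = "DUE"'
LOOP WHILE READNEXT ID DO
   WRITE 'PENDING' TO F.QUEUE, ID
REPEAT

* Worker (one per phantom): claim items under a record lock so no
* two workers ever take the same one.
OPEN 'QUEUE' TO F.QUEUE ELSE STOP 201, 'QUEUE'
SELECT F.QUEUE
LOOP WHILE READNEXT ID DO
   READU STATE FROM F.QUEUE, ID LOCKED CONTINUE THEN
      IF STATE = 'PENDING' THEN
         GOSUB PROCESS.ITEM
         WRITE 'DONE' TO F.QUEUE, ID  ;* WRITE also releases the lock
      END ELSE RELEASE F.QUEUE, ID
   END ELSE RELEASE F.QUEUE, ID
REPEAT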
I've seen a couple of versions of this, one for electricity billings and
another for overnight batch-processing of report requests, both well
over a decade ago, and neither still in use although their underlying
packages are still being run.

The major issue is that these days the whole entity in the general
commercial world is far more likely to be I/O limited than CPU limited,
and therefore introducing parallelism will be no help at all if the I/O
system is already choked.
Even if the system is currently CPU-limited, multi-threading may not
produce much improvement without very careful design of the record
locking philosophy - introducing parallelism will be no help if all the
threads end up contending serially for one record lock or a small set of
locks.


If you want it to go faster, buy the CPU with the fastest clock you can
get (not the one with the most cores), and put your database on SSD like
Ross said.
The Power7+ chips being announced any day now are rumoured to go to
5GHz+, maybe even more if you have half the cores on the chip disabled.


Regards


Mike


Re: [U2] [u2] Parallel processing in Universe (Unclassified)

2012-10-01 Thread Ross Ferris
Interestingly, I'm currently trying to find a definitive answer/correlation
between clock speed and performance on a single core/thread on Intel CPUs to
confirm, or deny, that for grunt batch work a 4C Intel running @ 3.4GHz will
actually be faster than an 8C running @ 2.7GHz -- the answer isn't as
straightforward (or as easy to find) as I would have hoped, as even within
the same family (E5-2600) there can be architectural differences that come
into play - and if anyone has a definitive answer, please feel free to share!

Ross Ferris
Stamina Software
Visage - Better by Design!

