[Haskell-cafe] Re: Seeking advice on a style question

2006-12-27 Thread apfelmus
Steve Schafer wrote:
> In my text/graphics formatting work, I find myself doing a lot of
> "pipeline" processing, where a data structure will undergo a number of
> step-by-step transformations from input to output. For example, I have a
> function that looks like this (the names have been changed to protect
> the innocent--and to focus on the structure):
> 
>  process :: a -> b -> c -> d -> e
>  process x1 x2 x3 x4 = 
>let y01   = f01 x1 x2 x3;
>y02   = f02 x1;
>y03   = f03 y02;
>y04   = f04 y03;
>y05   = f05 x1 y01 y04;
>y06   = f06 x2 y05;
>(y07,y08) = f07 y01 y06;
>y09   = f08 y07;
>(y10,y11) = f09 x2 x4 y09 y08;
>y12   = f10 y10;
>y13   = f11 y12;
>y14   = f12 x1 x2 x3 y01 y13;
>y15   = f13 y14;
>y16   = f14 y15 y11
>in y16
> [...]
> In principle, it could be
> managed with a bunch of nested StateT monads, but my attempts to do so
> seem to get so caught up in the bookkeeping that I lose the advantages
> mentioned above.
> [...]
> So here's the question: Is there a reasonable way to express this kind
> of process (where I suppose that "reasonable" means "in keeping with
> Haskell-nature") that preserves the advantages mentioned above, but
> avoids having to explicitly pass all of the various bits of state
> around?

To me, it looks more like MonadReader than MonadState because I have the
impression that x1 and x2 are "enivronments" to fetch something from.
(Btw, MonadReader is best used as an Applicative Functor, but that's a
different story).

But in general, it's futile trying to simplify things without knowing
their meaning: names are *important*. I assume that your proper goal is
not to structure pipeline processes in full generality, but to simplify
the current one at hand.

Even if you wanted to simplify the general structure, I think you'd have
to make the types of the different yk explicit. Otherwise, the problem
is underspecified and/or one has to assume that they're all different
(modulo some equalities implied by type correctness).


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Seeking advice on a style question

2006-12-29 Thread apfelmus
>> I assume that your proper goal is not to structure pipeline processes
>> in full generality, but to simplify the current one at hand.
>
> No, I'm looking for full generality.  ;)
> I have dozens of these kinds of "quasi-pipelines," all similar in
> overall appearance, but different in detail.

Ah, the names help a lot and they confirm my uneasy feeling about the
quasi-pipeline: I think it's ad-hoc. What I want to say is that I
suggest refactoring the pipeline to become expressible as a point-free
function concatenation by decoupling data dependencies instead of trying
to find a way to express arbitrary interlinked quasi-pipelines. So I'd
eliminate the problem by switching to a different one :) Of course, this
decoupling needs the concrete names and types, that's why I wanted to
know them.

So, let's have look on the data dependencies, taking the information
from your web-site into account. I'll formulate some guesses and
questions, you don't need to comment on them; they're just meant as
hints. The point is that its *type* and not so much its name should
guide what a functions does. This way, dependencies on unnecessary
parameters automatically go away.


>> process :: Item -> MediaKind -> MediaSize -> Language -> SFO
"Item" doesn't tell me anything. Seems to be an XML-File containing the
questions and such.

>> process item mediaKind mediaSize language =
>>   let pagemaster = loadPagemaster item mediaKind mediaSize;

Mh, I cannot guess what a "pagemaster" might do, but from its arguments,
it looks like the "ink guy" responsible for actual printing (or
on-screen display). So he might know about graphics, colors and inches
but not about content.

>> validateQuestionContent :: [Question] -> [Question]
>>   questions = stripUndisplayedQuestions mediaKind $

Ok, mediaKind is indispensable because on-screen and print forms are
utterly different. Maybe one should write
  filter willBeDisplayedQuestion $
instead, but I think the name 'stripUndisplayedQuestions' says it all.

>>   appendEndQuestions item pagemaster $

Uh, why do questions depend on pagemaster and thus on mediaSize? Are
these some floating questions appearing on every page, like the name of
the guy to be questioned? Those should be treated somewhere else.
Here, one could also write
  (++ endquestions ...) $
so that everybody sees what's going on, but again 'appendEnd' is
adequate. The dependency on the full item is too much, I think.

>> coalesceParentedQuestions :: [Question] -> [Question]
>>   coalesceParentedQuestions $

This makes me suspicious whether [Question] is the right type. Apparently,
   data Question = GroupedQuestions [String]
or something like that, so that a Question may well be a tree of
questions. Guessing that this function resolves some tree structure that
got specified by explicitly naming nodes, I'd suggest a type
'[QuestionTaggedWithLevel] -> Tree Question' instead. Note that one also
has fold and filter on Trees for further processing.

>> validateQuestionContent :: [Question] -> [Question]
>>   validateQuestionContent $

Uh, I think the type is plain wrong. Doesn't the name suggest 'Question
-> Bool' and a fatal error when a question content is invalid?

>>   loadQuestions item;

'loadQuestions' is a strange name (too imperative) but that's personal
taste and maybe a hint to the fact that 'item' is stored inside a file.

>>  (numberedQuestions,questionCategories) = numberQuestions pagemaster 
>> questions;

Yet again the pagemaster. I don't think that mere numbering should
depend on mediaSize, not even implicitly. Why must questionCategories be
collected? Aren't they inherent in 'Tree Question', so that every Branch
has a unique category? Automatic numbering is fine, though.

>>  numberedQuestions' = coalesceNAQuestions numberedQuestions;
Does 'NA' mean not answered? Isn't that either a fatal error or a
Maybe-Answer? 'coalesce' makes me suspicious, I could live with a 'filter'.

>>  (bands,sequenceLayouts) = buildLayout mediaKind language 
>> numberedQuestions';
Ah, there's no pagemaster, only mediaKind and language, although the
pagemaster would be tempting here. I guess that layout builds for
'endless paper' (band).

>>  bands' = resolveCrossReferences bands;
Mh, cross reference for thing x on page y? But there aren't any pages
yet. Likely that I just don't know what bands are.

>>  groupedBands = groupBands bands';
(can't guess on that)

>>  pages = paginate item mediaKind mediaSize pagemaster groupedBands;
Now, the dependence on mediaSize is fine. But it's duplicated by pagemaster.

>>  pages' = combineRows pages;
>>  sfo = createSFO pages' sequenceLayouts;
>>  in sfo
(can't guess on that)


In summary, I think that the dependencies on the pagemaster are not
adequate, he mixes too many concerns that should be separated. If he
goes, there's much more function concatenation possible. If really
necessary, the MediaKi

[Haskell-cafe] Re: Seeking advice on a style question

2006-12-31 Thread apfelmus
>> In summary, I think that the dependencies on the pagemaster are not
>> adequate, he mixes too many concerns that should be separated.
>
> True, but then that's even more miscellaneous bits and pieces to carry
> around. I guess what makes me uncomfortable is that when I'm writing
> down a function like process1 (not its real name, as you might imagine),
> I want to concentrate on the high-level data flow and the steps of the
> transformation. I don't want to have to exposes all of the little bits
> and pieces that aren't really relevant to the high-level picture.
> Obviously, in the definitions of the functions that make up process1,
> those details become important, but all of that should be internal to
> those function definitions.

Yes, we want to get rid of the bits and pieces. Your actual code is
between two extremes that both manage to get rid of them. One extreme is
the "universal" structure like you already noted:

> Alternatively, I can wrap all of the state up into a single universal
> structure that holds everything I will ever need at every step, but
> doing so seems to me to fly in the face of strong typing; at the early
> stages of processing, the structure will have "holes" in it that don't
> contain useful values and shouldn't be accessed.

Currently, (pagemaster) has tendencies to become such a universal beast.
The other extreme is the one I favor: the whole pipeline is expressible
as a chain of function compositions via (.). One should be able to write

  process = rectangles2pages . questions2rectangles

This means that (rectangles2pages) comes from a (self written) layout
library and that (questions2rectangles) comes from a question formatting
library and both concern are completely separated from each other. If
such a factorization can be achieved, you get clear semantics, bug
reduction and code reuse for free.

Of course, the main problem is: the factorization does not arise by
coding, only by thinking. Often the situation is as following and I for
myself encounter it again and again: one starts with an abstraction
along function composition but it quickly turns out, as you noted, that
"there are some complicated reasons why that doesn't work". To get
working code, one creates some miniature "universal structure" that
incorporates all the missing data that makes the thing work. After some
time, the different concerns get more and more intertwined and soon,
every data depends on everything else until the code finally gets
unmaintainable, it became "monolithic".

What can be done? The original problem was that the solutions to the
originally separated concerns (layout library and questions2rectangles)
simply were not powerful, not general enough. The remedy is to
separately increase the power and expressiveness of both libraries until
the intended result can be achieved by plugging them together.
Admittedly, this is not an easy task. But the outcome is
rewarding: by thinking about the often ill-specified problems, one
understands them much better and it most often turns out that some
implementation details were wrong and so on. In contrast, the ad-hoc
approach that introduces miniature "universal structures" does not make
the libraries more general, but tries to fit them together by appealing
to the special case, the special problem at hand. In my experience, this
only makes things worse.
The point is: you have to implement the functionality anyway, so you may
as well grab some free generalizations and implement it once and for all
in an independent and reusable library.


I think that the following toy example (inspired by a discussion from
this mailing list) shows how to break intertwined data dependencies:

  foo :: Keyvalue -> (Blueprint, Map') -> (Blueprint', Map)
  foo x (bp,m') = (insert x bp, uninsert x bp m')

The type for (foo) is much too general: it says that foo may mix the
(Blueprint) and the (Map') to generate (Blueprint'). But this is not the
case, the type for foo introduces data dependencies that are not present
at all. A better version would be

  foo' :: Keyvalue -> Blueprint -> (Blueprint', Map' -> Map)
  foo' x bp = (insert x bp, \m' -> uninsert x bp m')

Here, it is clear that the resulting (Map) depends on (blueprint) and
(Map'), but that the resulting (Blueprint') does not depend on (map').
The point relevant to your problem is that one can use (foo') in more
compositional ways than (foo) simply because the type allows it. For
instance, you can recover (insert) from (foo'):

  insert :: Keyvalue -> Blueprint -> Blueprint'
  insert x bp = fst $ foo' x bp

but this is impossible with (foo).*

In the original problem, the type signature for (foo') was that best one
could get. But here, the best type signature is of course

  foo'' :: ( Keyvalue -> Blueprint -> Blueprint'
   , Keyvalue -> Blueprint -> Map' -> Map )
  foo'' = (insert, uninsert)

because in essence, (foo) is just the pair (insert, uninsert).

One morale from the above example is that functio

[Haskell-cafe] Re: Seeking advice on a style question

2007-01-07 Thread apfelmus
Steve Schafer wrote:
> [Apologies for the long delay in replying; I've been traveling, etc.]
[never mind]

>> The other extreme is the one I favor: the whole pipeline is expressible
>> as a chain of function compositions via (.). One should be able to write
>>
>>  process = rectangles2pages . questions2rectangles
>>
>> This means that (rectangles2pages) comes from a (self written) layout
>> library and that (questions2rectangles) comes from a question formatting
>> library and both concern are completely separated from each other. If
>> such a factorization can be achieved, you get clear semantics, bug
>> reduction and code reuse for free.
> 
> I favor that approach, too. ;) The problem is that when there is a
> multi-step process, and various bits of information get propagated
> throughout, as required by the various steps in the process, the overall
> decomposition into a series of steps a . b . c . ... can become brittle
> in the face of changing requirements.
> 
> Let's say, for example, a change request comes in that now requires step
> 13 to access information that had previously been discarded back at step
> 3. The simple approach is to propagate that information in the data
> structures that are passed among the intervening steps. But that means
> that all of the steps are "touched" by the change--because the relevant
> data structures are redefined--even though they're just passing the new
> data along.

Ah, I forgot to point it out, polymorphism is your dear friend, of
course. For example, 'rectangles2pages' should be fully polymorphic in
the stuff that's inside the rectangles, just like for instance 'nub ::
Eq a => [a] -> [a]' is polymorphic in the list elements. One possibility
is something like

   class Rectangle a where
   width, height :: a -> Integer

   type Pagenumber = Integer
   data Rectangle a => Pages a = Pages {
 stickyboxes   :: [(Position, a)]  -- appear on every page
   , pagenumberpos :: Position -- absolute numbering later
   , pages :: Data.Map Pagenumber [(Position, a)]
   -- actual contents
   }
   data Position = Position { x :: Integer, y :: Integer }

   data Footer a = Footer { content :: a, position :: HAlign }
   data HAlign = Left | Center | Right

   rectangles2pages :: Rectangle a => Footer a -> [a] -> Pages a

but there are many others. The type of 'rectangles2pages' dictates that
it can only rearrange but not alter the data inside 'a' (this is due to
_parametric_ polymorphism). Now, you may use it for normal text
processing via

   instance Rectangle Paragraph where

Or you can abuse the pagination algorithm to align bread, buns and
cookies on several tablets for baking in the stove:

   instance Rectangle Cookie where -- in the sense of bounding box

however you like it :) The point is that you will have minimal trouble
when requirements change if you somehow managed to keep
'rectangles2pages' as general as possible. The code above does not need
to be changed if you need to carry extra information around in 'a'.
That's what I meant with "(self written) layout library": for me, a
"library" is necessarily polymorphic.

Later on, you can specialize the types. So printing will be

   print :: Pages Paragraph -> Graphic

and it is clear that 'Pages Cookie' cannot be printed. A pipeline could
look like

   process = print . rectangles2pages footer .  ...
   where footer = loadFooter item

Note that I chose to still plumb a footer around, which makes sense in
case it is fully specified in the item. Of course, it can also be put in
pair '(Footer a, [a])':

   process = print . uncurry rectangles2pages .  ...

which is the way to go if it is generated on the fly by the previous
step in the pipeline. Given suitable generality, the right choice is
often natural and satisfying. And who says that there isn't an even
better generalization that incorporates the footer more elegantly?


Of course, the difficult thing is to discover the right generalization.
This can be quite an art. Oh, I can spent days of thinking without
writing a single line of code and I have never, ever encountered a
situation where it wasn't worth the effort. The point is: you have to
implement the corresponding general functionality anyway because often,
even the special case needs the full power in some way or another.
Implementing it for the special case only is like coding with a
blindfold. Still, the generalization can turn out to be inadequate, but
because we don't need to worry about the type 'a', things will be
easier. And, at some point, the interface doesn't need to change
anymore: every change makes it more general, there is a maximum
generality and we know that monotone sequences converge :)


>> Btw, the special place "end" suggests that the "question markup
>> language" does not incorporate all of: "conditional questions",
>> "question groups", "group templates"? Otherwise, I'd just let the user
>> insert
>>

Re: [Haskell-cafe] Re: Seeking advice on a style question

2006-12-28 Thread Steve Schafer
On Wed, 27 Dec 2006 17:06:24 +0100, you wrote:

>But in general, it's futile trying to simplify things without knowing
>their meaning: names are *important*. I assume that your proper goal is
>not to structure pipeline processes in full generality, but to simplify
>the current one at hand.

No, I'm looking for full generality. ;)

I have dozens of these kinds of "quasi-pipelines," all similar in
overall appearance, but different in detail.

>Even if you wanted to simplify the general structure, I think you'd have
>to make the types of the different yk explicit. Otherwise, the problem
>is underspecified and/or one has to assume that they're all different
>(modulo some equalities implied by type correctness).

Most of them are, in fact, different types (see my reply to Conal).



Here's the essence of the problem. If I have this:

 process1 x y =
   let u = foo x y;
   v = bar u;
   w = baz v
   in  w

I can easily rewrite it in point-free style:

 process1 = baz . bar . foo

But if I have this:

 process2 x y =
   let u = foo x y;
   v = bar u;
   w = baz v u
   in  w

then I can't avoid naming and using an intermediate variable. And that
annoys me. The u in process2 is of no more value to me (pardon the
pun) as the one in process1, but I am forced to use it simply because
the data flow is no longer strictly linear.

The reason I brought up monads as a possible means of managing this
problem is that the State, Reader and Writer monads already handle
certain specific "shapes" of nonlinear data flow, which suggested to
me that maybe there was a monadic approach to managing nonlinear data
flow in a more general way. Of course, if there is a non-monadic,
purely functional way to do it, that would be even better, but I've
never seen such a thing (short of doing lots of tupling and
un-tupling).

Steve Schafer
Fenestra Technologies Corp.
http://www.fenestra.com/
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Seeking advice on a style question

2006-12-29 Thread Greg Buchholz
Steve Schafer wrote:
> 
> Here's the essence of the problem. If I have this:
> 
>  process1 x y =
>let u = foo x y;
>v = bar u;
>w = baz v
>in  w
> 
> I can easily rewrite it in point-free style:
> 
>  process1 = baz . bar . foo

Not unless you have a much fancier version of function composition,
like...

http://okmij.org/ftp/Haskell/types.html#polyvar-comp

> 
> But if I have this:
> 
>  process2 x y =
>let u = foo x y;
>v = bar u;
>w = baz v u
>in  w
> 
> then I can't avoid naming and using an intermediate variable. And that
> annoys me. The u in process2 is of no more value to me (pardon the
> pun) as the one in process1, but I am forced to use it simply because
> the data flow is no longer strictly linear.
> 
> The reason I brought up monads as a possible means of managing this
> problem is that the State, Reader and Writer monads already handle
> certain specific "shapes" of nonlinear data flow, which suggested to
> me that maybe there was a monadic approach to managing nonlinear data
> flow in a more general way. Of course, if there is a non-monadic,
> purely functional way to do it, that would be even better, but I've
> never seen such a thing (short of doing lots of tupling and
> un-tupling).

-- Use combinators which automate the tupling/un-tupling.
-- See also, the Joy language...
-- http://www.latrobe.edu.au/philosophy/phimvt/joy/j00rat.html

main = process2 test

process2 = baz . bar . dup . foo

foo = mul . (push 2) . mul

bar = rep . swap . (push 'A')

baz = add . len 

test = (2,(3,()))::(Int,(Int,()))


dup (a,b) = (a,(a,b))
swap (a,(b,c)) = (b,(a,c))
push a b = (a,b)

lift1 f (a,b) = (f a,b)
lift2 f (a,(b,c)) = (f a b,c)

len = lift1 length 
add = lift2 (+)
mul = lift2 (*)
rep = lift2 replicate


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Seeking advice on a style question

2006-12-29 Thread Steve Schafer
On Fri, 29 Dec 2006 09:01:37 -0800, you wrote:

>Steve Schafer wrote:
>> 
>> I can easily rewrite it in point-free style:
>> 
>>  process1 = baz . bar . foo
>
>Not unless you have a much fancier version of function composition,
>like...
>
>http://okmij.org/ftp/Haskell/types.html#polyvar-comp


Sorry; I obviously got a little carried away there:

 process1 x =
   let u = foo x;
   v = bar u;
   w = baz v
   in  w

 process1 = baz . bar . foo

Steve Schafer
Fenestra Technologies Corp.
http://www.fenestra.com/
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Seeking advice on a style question

2006-12-29 Thread Udo Stenzel
Steve Schafer wrote:
> Here's the essence of the problem. If I have this:
> 
>  process1 x y =
>let u = foo x y;
>v = bar u;
>w = baz v
>in  w
> 
> I can easily rewrite it in point-free style:
> 
>  process1 = baz . bar . foo

That should have been

  process1 = (.) (baz . bar) . foo

or something similar.  You might want to define suitable combinators if
you have this pattern more often:

  infix 8 .<
  (.<) = (.) . (.)
  process1 = baz . bar .< foo


> But if I have this:
> 
>  process2 x y =
>let u = foo x y;
>v = bar u;
>w = baz v u
>in  w
> 
> then I can't avoid naming and using an intermediate variable.

Turns out you can.

  process2 = \x y -> (\u -> baz (bar u) u) (foo x y)
   = \x y -> (\u -> (baz . bar) u u) (foo x y)
   = \x y -> liftM2 (baz . bar) (foo x y)
   = liftM2 (baz . bar) .< foo

In fact, you never need named values.  Read "How to Mock a Mockingbird"
by Richard Bird (if memory serves) or the documentation for the Unlamda
(esoteric) programming language to find out how or let Lambdabot do the
transformation to pointless style for you.

You don't need to go fully points free in every case.  In your original
example, only one intermediate (y01) was actually used more than once
and deserves naming.  Everything else can be composed with the help of
'uncurry'.  'liftM2' is also surprisingly useful, but it's use at the
type constructor (r ->) as in the last example probably deserves a name
of its own.


> The u in process2 is of no more value to me (pardon the
> pun) as the one in process1, but I am forced to use it simply because
> the data flow is no longer strictly linear.

Instead you could define intermediate functions by composing functions
with the help of a few combinators.  I guess, the construction (liftM2
(baz . bar)) could have a very meaningful name.  Some combinators might
also vanish if you reorder and/or tuple some of the arguments to
existing functions.

 
> The reason I brought up monads as a possible means of managing this
> problem is that the State, Reader and Writer monads already handle
> certain specific "shapes" of nonlinear data flow

Uhm... that could be said of Reader, but is no good description of the
others.  If you like, you could plug your y01 into a Reader Monad, but I
don't think it simplifies anything.  Sometimes naming values is simply
the right thing to do.


-Udo
-- 
Even if you're on the right track, you'll get run over if you just sit there.
-- Will Rogers


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Seeking advice on a style question

2006-12-30 Thread Steve Schafer
On Fri, 29 Dec 2006 21:01:31 +0100, you wrote:

>>> process :: Item -> MediaKind -> MediaSize -> Language -> SFO
>"Item" doesn't tell me anything. Seems to be an XML-File containing the
>questions and such.

The reason it's just "Item" is that it can be a number of different
things. It can be a full-blown questionnaire, composed of a number of
questions, but it could also be just one question (sometimes the users
want to see what a question layout looks like before okaying its
inclusion into the questionnaire stream). The functions are overloaded
to handle the various different kinds of Items.

>Mh, I cannot guess what a "pagemaster" might do, but from its arguments,
>it looks like the "ink guy" responsible for actual printing (or
>on-screen display). So he might know about graphics, colors and inches
>but not about content.

A pagemaster defines the sizes and locations of the various parts of the
page (top and bottom margins, left and right sidebars, body region), as
well as the content of everything except the body region (which is where
the questions go). There are four different page definitions in the
pagemaster: first page, last page, even page and odd page.

The pagemaster also contains a couple of other bits of information that
don't fit neatly anywhere else (discussed below).

>Maybe one should write
>  filter willBeDisplayedQuestion $
>instead, but I think the name 'stripUndisplayedQuestions' says it all.

Sure. "stripUndisplayedQuestions" is indeed just a simple filter.

>>>   appendEndQuestions item pagemaster $
>
>Uh, why do questions depend on pagemaster and thus on mediaSize? Are
>these some floating questions appearing on every page, like the name of
>the guy to be questioned? Those should be treated somewhere else.

End questions are questions that are inserted automagically at the end
of (almost) every questionnaire. They depend on the Item because only
questionnaires get them, and they depend on the pagemaster because not
every questionnaire gets them. (This is one of those additional bits of
information that is stored in the pagemaster. It may seem like it would
be better stored in the questionnaire itself, but there are some
complicated reasons why that doesn't work. Obviously, it would be
possible to rearrange the data after it is retrieved from the database,
although I'm not sure that there would be a net simplification.)

>>> coalesceParentedQuestions :: [Question] -> [Question]
>>>   coalesceParentedQuestions $
>
>This makes me suspicious whether [Question] is the right type. Apparently,
>   data Question = GroupedQuestions [String]
>or something like that, so that a Question may well be a tree of
>questions. Guessing that this function resolves some tree structure that
>got specified by explicitly naming nodes, I'd suggest a type
>'[QuestionTaggedWithLevel] -> Tree Question' instead. Note that one also
>has fold and filter on Trees for further processing.

Some questions are composed of multiple sub-questions that are treated
as separate questions in the database. Because the people who created
and maintain the database have difficulty fully grasping the concept of
trees (or hierarchies in general, actually), I have to jump through a
few hoops here and there to massage the data into something meaningful.

While it's true that a parent question looks superficially like a tree
of child questions, there's more to it than that; the visual layout of
the parent question is not generated by a simple traversal over its
children, for example. So, for all of the processing that follows, a
parent question (one with child questions) looks just like any other
question, and any parent question-specific details remain hidden inside.

>>> validateQuestionContent :: [Question] -> [Question]
>>>   validateQuestionContent $
>
>Uh, I think the type is plain wrong. Doesn't the name suggest 'Question
>-> Bool' and a fatal error when a question content is invalid?

No. The idea is to never fail to assemble the questionnaire. If there is
a question with invalid content, then it is replaced by a dummy question
that contains some descriptive text explaining the problem. So
"validateQuestionContent" might more loquaciously be called
"inspectTheQuestionsAndReplaceAnyThatDontLookRightWithAnErrorMessageShapedLikeAQuestion."

I haven't shown it here, but there is an accompanying Writer that
accumulates a log of errors and warnings as well. The final step
generates and prepends a "job ticket" page onto the output; the errors
and warnings are listed on that page.

>>>   loadQuestions item;
>
>'loadQuestions' is a strange name (too imperative) but that's personal
>taste and maybe a hint to the fact that 'item' is stored inside a file.

A database, actually. First, the item's details are retrieved, and
depending on what kind of item it is, a list of questions associated
with that item is retrieved. For example, if the item is a
questionnaire, things li

Re: [Haskell-cafe] Re: Seeking advice on a style question

2007-01-04 Thread Steve Schafer
[Apologies for the long delay in replying; I've been traveling, etc.]

On Sun, 31 Dec 2006 20:11:47 +0100, you wrote:

>The other extreme is the one I favor: the whole pipeline is expressible
>as a chain of function compositions via (.). One should be able to write
>
>  process = rectangles2pages . questions2rectangles
>
>This means that (rectangles2pages) comes from a (self written) layout
>library and that (questions2rectangles) comes from a question formatting
>library and both concern are completely separated from each other. If
>such a factorization can be achieved, you get clear semantics, bug
>reduction and code reuse for free.

I favor that approach, too. ;) The problem is that when there is a
multi-step process, and various bits of information get propagated
throughout, as required by the various steps in the process, the overall
decomposition into a series of steps a . b . c . ... can become brittle
in the face of changing requirements.

Let's say, for example, a change request comes in that now requires step
13 to access information that had previously been discarded back at step
3. The simple approach is to propagate that information in the data
structures that are passed among the intervening steps. But that means
that all of the steps are "touched" by the change--because the relevant
data structures are redefined--even though they're just passing the new
data along.

The less simple (and not always feasible) approach is to essentially
start over again and re-jigger all of the data structures and
subprocesses to handle the new requirement. But this can obviously
become quite a task.

>If there are only the cases of some single question or a full
>questionnaire, you could always do
>
>blowup :: SingleQuestion -> FullQuestionaire
>preview = process (blowup a_question) ...
>
>In general, I think that it's the task of (process) to inspect (Item)
>and to plug together the right steps. For instance, a single question
>does not need page breaks or similar. I would avoid overloading the
>(load*) functions and (paginate) on (Item).

A single question can be several pages long, so it does need to be
paginated. The reason for the decomposition as it now stands is that any
item (and there are more kinds of items than just questions and
questionnaires) can be decomposed into a pagemaster and a list of
questions. Once that has occurred, all items acquire essentially the
same "shape." That's why loading the pagemaster and loading the
questions are the first two steps in the process.

>Btw, the special place "end" suggests that the "question markup
>language" does not incorporate all of: "conditional questions",
>"question groups", "group templates"? Otherwise, I'd just let the user
>insert
>
>   
>  
>   
>
>at the end of every questionnaire. If you use such a tiny macro language
>(preferably with sane and simple semantics), you can actually merge
>(stripUndisplayedQuestions) and (appendEndQuestions) into a function
>(evalMacros) without much fuss.

If only I had the power to impose those kinds of changes

Unfortunately, I have little control over the logical organization of
questions, questionnaires and all of the other little bits and pieces.
(I assure you I would have done it quite differently if I could.)
Instead, I have to deal with an ad hoc pseudo-hierarchical
quasi-relational database structure, and to settle for occasional extra
columns to be added to the tables in order to specify information that I
can't synthesize any other way.

>Uh, that doesn't sound good. I assume that the post-processing is not
>implemented in Haskell?

Not even remotely so. ;) In the paper world, post-processing consists of
semi-automated collation and stapling of the actual printed pages. In
the electronic world, during previous survey periods, an analogous
process was used (a "front" questionnaire and a "back" questionnaire
would be figuratively stapled together); we're looking to make the
merging a bit smoother and more automatic this time around.

As is often the case, the motivation for the rather arcane
post-processing is human, rather than technical. Let's say I have ten
different questionnaires, where the first five pages of each
questionnaire are identical, and these are followed by six additional
pages that differ from one questionnaire to another. That's a total of
10 * 11 = 110 pages, but only 5 + 10 * 6 = 65 _distinct_ pages.

As hard as it may be to believe, the people who are responsible for
approving the questionnaires see it like this: If the system produces
one 5-page "front" questionnaire and ten 6-page "back" questionnaires,
then that's 65 pages that they have to inspect. But if the system were
to produce ten 11-page questionnaires, even though the first five pages
of each questionnaire are generated from exactly the same data using
exactly the same software, that's 110 pages that they have to inspect.

>Fine, though I don't see exactly why this isn't done before after the
>questions have

Re[2]: [Haskell-cafe] Re: Seeking advice on a style question

2006-12-29 Thread Bulat Ziganshin
Hello Steve,

Friday, December 29, 2006, 5:41:40 AM, you wrote:

> then I can't avoid naming and using an intermediate variable. And that
> annoys me. The u in process2 is of no more value to me (pardon the
> pun) as the one in process1, but I am forced to use it simply because
> the data flow is no longer strictly linear.

it force you to give names to intermediate results which is considered as
good programing style - program becomes more documented. alternatively, you
can give simple names like a b c



-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re[4]: [Haskell-cafe] Re: Seeking advice on a style question

2006-12-30 Thread Bulat Ziganshin
Hello Steve,

Friday, December 29, 2006, 8:10:29 PM, you wrote:

>>it force you to give names to intermediate results which is considered as
>>good programing style - program becomes more documented.

> But that would imply that function composition and in-line function
> definition are also Bad Style.

and omitting type signatures too :)  yes, to some degree. balance between
naked code and comments is your choice



-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Re[2]: [Haskell-cafe] Re: Seeking advice on a style question

2006-12-29 Thread Steve Schafer
On Fri, 29 Dec 2006 14:23:20 +0300, you wrote:

>it force you to give names to intermediate results which is considered as
>good programing style - program becomes more documented.

But that would imply that function composition and in-line function
definition are also Bad Style.

Steve Schafer
Fenestra Technologies Corp.
http://www.fenestra.com/
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe