In the world of OOP am I Hero or Heretic?

By Tony Marston

25th November 2004

Introduction
An innocent start
1. How do you separate Business and Data Access logic?
2. What is the benefit of separating logic into different layers?
3. Data Access Layer - Is there a real separation?
4. Information is not Logic just as Data is not Code
5. Can an object return more than one row?
6. What does 'encapsulation' really mean?
7. You are using the wrong design patterns
8. How easy would it be to change the database?
9. There is a dependency between your db schema and the presentation layer
10. You are arrogant and unwilling to learn
11. Your code smells and needs to be refactored
12. Your code is not OO, it is procedural
13. Your system is not 3-Tier
14. Here is an example of how it should be done
Conclusion

 

Introduction

Some people think I am a hero for daring to question the established views on object-oriented programming.

Some people think I am a heretic for daring to question the established views on object-oriented programming.

Ever since I started publishing my views and experiences on my personal website, or answered questions in newsgroups or other forums, people have fallen into one of three possible camps:

There are so many people out there with so many different opinions on how things should or should not be done that it is physically impossible to agree with everybody, so no matter what I do or say there will always be someone who thinks I am wrong. Such is life. However, there are some people who are so entrenched in their beliefs that they cannot stand the idea of anyone holding a different opinion. They are like religious fanatics who believe that theirs is the only god, the one true god, and that all disbelievers are heretics who should be burned at the stake. They scour the internet for disbelievers and, once found, they send in today's equivalent of the Spanish Inquisition, known as the 'Paradigm Police'.

Visitors to my website may already be familiar with "What is/is not considered to be good OO programming", which I wrote a year ago in response to those who thought that my approach to OOP was unacceptable. It appears that the silly season is here again, as I am now forced to respond to a fresh outburst of similar 'paradigm persecution'. What started off quite innocently with rational arguments eventually degenerated into personal abuse, which caused the site moderator to terminate the thread.

An innocent start

It all began with an entry in the Sitepoint PHP Blog written by Harry Fuecks in which he drew attention to my personal website with the following:

Just ran into A Development Infrastructure for PHP which discusses a generalized strategy which Tony uses when building applications with PHP. He clarifies some of the design decisions further in the FAQ.

Think Tony's done an excellent job - it's an end to end view of the approach he's evolved for building web applications and is also more or less a first - I've yet to see anyone be bold enough to describe a complete PHP architecture in this kind of detail. There's a lot of valuable insight in there and get the general impression of "sanity" (i.e. taking into account PHP's inherent advantages and disadvantages).

(Thank you Harry. The cheque's in the post)

This then became a series of questions and answers.

1. How do you separate Business and Data Access logic?

This question came from JNKlein:

I read (everything on) your site, Tony, and found it very enlightening - thank you.

A problem I continually have, when reading various literature on the subject or perusing sitepoint's forums or various informative web sites, is with the separation between the Business Logic and Data Access Logic.

Don't get me wrong - I get it; its a very worthwhile ideal to be able to switch the way you store your data and maintain the same business logic, or use your data in a new way by simply creating new components in the business logic without having to change the data access Logic.

But the two seem inexorably tied together in a practical and realistic sense. After all, what is your business logic without data to work with, and at that, should your business logic really be able to handle ANY data you pass to it? Is that healthy?

Let me use the example of the User class, something I'm sure we're all familiar with. Suppose one of the properties of a "user" is "username". You whip up your User class, then you whip up a UserMapper class, which has a method insert(User). The internals of this method inevitably make specific reference to the properties of a User object AND inevitably make specific reference to your method of storing the data (even with a DAO you need to specifically state where you are putting the info... "INSERT ... WHERE username = etc).

So now, instead of releasing your data access from the tied-down clutches of your business logic, you've tied it down to both the business logic AND the extremely specific data storage mechanism.

Perhaps what I am trying to understand is whether or not people are really serious when they talk about separation of logic. When you add a new property to your theoretical user, you're going to have to add it in the User class, then add a place to put it in the data storage mechanism, and lastly, you've got to change the functions that map your class to your data storage.

Perhaps I'm missing the point here - so please, set my disillusioned mind to rest by clarifying what I'm getting wrong here.

I do see the benefits in being able to use the User class in any number of ways and always being able to instantiate a UserMapper to insert(User), but what is the difference from having this code internal to the User class in a practical sense (as opposed to the "because OOP is right" reasoning).

I replied as follows:

The primary purpose of having a separate object in the data access layer (sometimes known as a Data Access Object or DAO) is that it should be possible to switch the entire application from one data source to another simply by changing this one component. Thus if I want to switch my application from MySQL to PostgreSQL (or whatever) I simply change my DAO.

In order to make this work in practice my own implementation is as follows:

(a) Each business entity (eg: customer, product, invoice) has its own class. This identifies the structure of the associated database table plus all the business rules required by that entity. Each of these is actually a subclass of a generic table class which contains sharable code that can be applied to any database table.

(b) When the business object gets instructions to update the database it does so via an insertRecord(), updateRecord(), or deleteRecord() method which contains the entire $_POST array. This is validated according to whatever rules have been defined within that particular subclass. If there are no errors it will talk to the relevant DAO in order to perform the database update.

(c) The DAO also has the insertRecord(), updateRecord() and deleteRecord() methods, but as well as the validated contents of the $_POST array it is also given a second array which contains all the table structure details. Using these two arrays it is easy to construct the relevant SQL query string before calling the relevant database API.

In this way my business object contains business rules, but no calls to database APIs, and my DAO contains calls to database APIs but no business rules. This is clear separation of logic.
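The division of labour in (a) to (c) can be sketched roughly as follows. This is a minimal illustration only, not the code from the sample application: the class names SketchUser and SketchDao, the shape of the $fieldspec array and the 'required' rule are all assumptions, and the DAO simply returns the SQL string instead of calling a database API.

```php
<?php
// Hypothetical data access object: it builds SQL, but knows no business rules.
class SketchDao
{
    // $fieldarray holds validated user data, $fieldspec the table structure.
    public function insertRecord(string $table, array $fieldarray, array $fieldspec): string
    {
        // Only keep values whose names appear in the table structure.
        $data = array_intersect_key($fieldarray, $fieldspec);
        $columns = implode(', ', array_keys($data));
        $values = implode(', ', array_map(
            function ($value) {
                // A real DAO would use the escaping function of its database API.
                return "'" . str_replace("'", "''", (string) $value) . "'";
            },
            $data
        ));
        return "INSERT INTO $table ($columns) VALUES ($values)";
    }
}

// Hypothetical business object: business rules and meta-data, no database API calls.
class SketchUser
{
    public $errors = [];
    private $tablename = 'user';
    private $fieldspec = [
        'user_id'   => ['type' => 'string', 'required' => true],
        'user_name' => ['type' => 'string', 'required' => true],
    ];

    // Validates the posted data, then hands it to the DAO with the meta-data.
    public function insertRecord(array $post, SketchDao $dao): ?string
    {
        $this->errors = [];
        foreach ($this->fieldspec as $name => $spec) {
            if (!empty($spec['required']) && (!isset($post[$name]) || $post[$name] === '')) {
                $this->errors[$name] = 'This field is required';
            }
        }
        if ($this->errors) {
            return null; // validation failed, so the DAO is never called
        }
        return $dao->insertRecord($this->tablename, $post, $this->fieldspec);
    }
}

// Usage: the 'submit' button value is not a column, so the DAO ignores it.
$user = new SketchUser();
$sql = $user->insertRecord(
    ['user_id' => 'AJM', 'user_name' => "O'Brien", 'submit' => 'OK'],
    new SketchDao()
);
```

Note that the business object never sees the SQL syntax, and the DAO never sees the validation rule; the only thing that crosses the boundary is data.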

Switching from one DBMS to another is simple to achieve in my infrastructure. In my generic table superclass I have a variable called $dbms_engine which is set to 'mysql' or 'postgresql' or whatever. This will then apply to all database tables unless overridden in any individual subclass. When the business object wants to talk to the data access object it first needs to instantiate an object from a class which is defined within a separate include() file. The name of this file is in the format 'dml.<engine>.class.inc' where <engine> is replaced by the contents of variable $dbms_engine. I have a separate version of this include() file for every DBMS that I use. All I need to do before accessing a new DBMS is to create a new version of the 'dml.<engine>.class.inc' file and I'm up and running.
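As a rough sketch of that switching mechanism, assuming class names of the form Dml_<engine> and a helper called dmlFileName() (neither name is taken from the real framework), the selection of the data access class might look like this. The two DML classes are defined inline here so that the example is self-contained; in practice each would live in its own include() file.

```php
<?php
// Stand-ins for the per-DBMS classes that would each live in their own
// 'dml.<engine>.class.inc' file; the class names here are assumptions.
class Dml_mysql
{
    public function engineName(): string { return 'mysql'; }
}

class Dml_postgresql
{
    public function engineName(): string { return 'postgresql'; }
}

// Hypothetical helper: the name of the include() file for a given engine.
function dmlFileName(string $dbms_engine): string
{
    return 'dml.' . $dbms_engine . '.class.inc';
}

// Hypothetical generic table class holding the default engine for all tables.
class GenericTable
{
    protected $dbms_engine = 'mysql';

    public function getDao()
    {
        // A real implementation would first load dmlFileName($this->dbms_engine).
        $class = 'Dml_' . $this->dbms_engine;
        if (!class_exists($class)) {
            throw new RuntimeException('No data access class for ' . $this->dbms_engine);
        }
        return new $class();
    }
}

// A subclass may override the engine for one particular table.
class AuditTable extends GenericTable
{
    protected $dbms_engine = 'postgresql';
}
```

With this arrangement `(new GenericTable())->getDao()` yields the MySQL class while `(new AuditTable())->getDao()` yields the PostgreSQL class, without either table class containing any DBMS-specific code.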

Another advantage of this mechanism is that it would even be possible to talk to different database tables through different DBMS engines within the same transaction. How's that for flexibility?

In case you want to see these (arrogant?) theories put into practice I have created a sample application which is described in http://www.tonymarston.net/php-mysql/sample-application.html. This contains links where you can run the application online as well as download all the source code and run it on your own machine. You can then examine the source code and tell me what I am doing wrong.

You also need to understand the difference between logic and information:

2. What is the benefit of separating logic into different layers?

This question came from Cochambre:

I've considered the logic layers separation in a Web Application many times. And the only thing that keeps me from effectively using it is that it's main function (independence of user interface, business rules and data storage/retrieval) only helps when migrating or extending to other script-language/data engine/platform. But this only happens very very few times in an Application Lifetime. In the other hand, this versatility has the inconvenience of not taking advantage of each platform/engine/language optimization benefits (which usually are not compatible between them), and this lowers the application performance, affecting the consequences directly to the users. So the question here is Versatility vs Performance. I believe that we must not punish the users by using this "development shortcuts". This, of course, is considering that you care about your application performance. (i'm sorry if I misspelled some words. I'm from Argentina)

I replied as follows:

Your view of the benefits of the 3-Tier architecture is very narrow, as in reality they are not restricted to changes in the scripting language, database engine or platform.

As my article is about building web applications with PHP, and PHP can run on many platforms, any argument about not being optimized for a particular platform is rather empty.

Being able to change from one database engine to another by changing just one component is not just a fancy expensive option that is rarely used. Take the case of MySQL, for example. For versions up to 4.0 you use the mysql_* functions, but to take advantage of the new features in 4.1 and above you must use the mysqli_* functions. How complicated would that be if you had hundreds of scripts to change instead of just one? You must also consider the case where a supplier creates an application which is then run on customers' own machines with the database of their choice. If it is coded so that it only runs with MySQL but they actually want PostgreSQL or Oracle or whatever then how difficult would it be to cater for the customer's needs?

Having presentation logic separated from business logic has other advantages besides a switch to a totally different user interface device (for example, from client/server to the web). In the first place the creation of efficient, artistic and user-friendly web pages requires more than a passing knowledge of (X)HTML and CSS (and perhaps JavaScript), which a lot of PHP coders lack. The people with these skills may have little or no ability with PHP, so by having separate layers you can have a different set of experts to deal with each layer. Another more common requirement is to have the ability to change the style of a web application with relative ease. By ensuring that all output is strict XHTML with all style specified in an external CSS stylesheet it is possible to change the entire 'look' of an application by changing a single CSS file.

In my infrastructure all my XHTML output is produced from a small set of generic XSL stylesheets, currently about 10 in number. Should I need to make changes to my 350+ screens that cannot be done by altering the CSS file, all I have to do is change those generic XSL stylesheets. You may think that such changes are rare, but what about when the time comes to convert your existing web application from HTML forms to XForms, the latest W3C standard? I can do that by changing 10 XSL stylesheets. Can you?

It was at this point that JNKlein decided to move this conversation from Harry's Blog into its own thread in the Sitepoint forums.

3. Data Access Layer - Is there a real separation?

To my question:

BTW, in your example you mentioned have a User class and a UserMapper class. Why two? I can put everything I need into a single class, which is what encapsulation is supposed to be about.

JNKlein replied:

if you have a User class that performs some business logic that doesn't interact with the database - suppose a hypothetical printUserName() method that just spits out the current $user->username, wouldn't you want a separate class that mapped a user to the database, either inserting or deleting or what-have-you? Then, if you needed to add more functionality on the business logic end, you would only change the User class (not the UserMapper class) to have another method, suppose printUserEmail(). This way, you can extend or refactor your User business logic (maintaining the same interface), without changing anything about the data access.

This is how I interpret the "separation of data access and business logic".

I most certainly would *NOT* create two classes for each entity, one which communicates with the database and one which does not, as this would break the principle of encapsulation which clearly states that the data for an entity and the operations that are performed on that data should be in the same class. If a class wishes to communicate with the database then it passes control to the DAO. If it decides not to communicate with the database then it does not pass control to the DAO. I do not need a separate class to handle this simple decision.

He went on to say:

Maybe I'm not doing a good job of explaining my logic here - George Schlossangle says it well in his "Advanced PHP" chapter on this very subject (which, again, I recommend). Maybe someone else can clarify?

Regardless, my question to Tony is this - if you have your business logic and data access logic in the same class, can they be separate?

No. The business logic and data access logic have to be in separate classes, otherwise they are not separate. In my framework the business logic exists in no other place than the business layer, and the data access logic exists in no other place than the data access layer. You are making a typical mistake by confusing "logic" with "information". Logic is program code whereas information is data. The fact that my business class contains information about the database table which it represents is not the same as logic. The program code which uses this information to communicate with the physical database is contained in only one place, and that one place is the DAO. None of my business classes ever communicates directly with the database - if any communication is necessary they pass control to the DAO.

In reference to the business logic, Tony said "Each business entity (eg: customer, product, invoice) has its own class. This identifies the structure of the associated database table..." - I think this is the part I'm having trouble understanding, because where is the separation of logic if the business entity knows both the business logic and must know the structure of the associated data storage mechanism.

The structure of the associated database table is held as information (meta-data), not logic (program code). The only logic (program code) that exists inside the business entity is business logic. You will not find any data access logic anywhere else but inside the DAO in the data access layer. I do not have data access logic inside any business object, therefore I have not broken the principle of the separation of logic. To say otherwise shows that you do not understand what "logic" actually is. Refer to Information is not Logic just as Data is not Code for a more detailed explanation.

And now, a separate question - what is the point of a DAO (chose your favorite - ADODB or PEAR DB) that probably won't make it any easier to change what database you're using, since there is inevitably still hardcoded some query that doesn't work the same in mySQL, msSQL, PostgreSQL, and Oracle, let alone just two of the above. Just because you execute($query) doesn't mean $query will actually work.

I do not have any hard-coded queries anywhere in my framework. I do not construct the $query string inside the business layer then send it to the data access object to be executed. I send the user data and table structure meta-data to the DAO, and it is the DAO which constructs the $query string according to the requirements of the particular DBMS. For example, each DBMS requires different code to deal with an auto-increment column, or different ways of dealing with LIMIT and OFFSET for pagination, but because I have a separate class for each DBMS each of those classes contains whatever code is necessary to construct the correct query for that DBMS. How each DBMS works is the responsibility of the DBMS class, and there is no code in any business object which is tied to any particular DBMS.
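To illustrate how pagination can be left entirely to the DBMS-specific class, here is a minimal sketch. The class and method names are assumptions made for this example, and the second class uses the OFFSET ... FETCH syntax accepted by SQL Server 2012 and later purely as an example of a dialect that differs from MySQL. The caller supplies only neutral components: table name, sort column, rows per page and page number.

```php
<?php
// Each concrete class below stands in for one DBMS-specific data access class.
// The class and method names are assumptions made for this sketch.
abstract class PagingDao
{
    // Builds a SELECT from neutral components supplied by the business layer.
    public function buildSelect(string $table, string $orderBy, int $rowsPerPage, int $pageNo): string
    {
        $offset = ($pageNo - 1) * $rowsPerPage;
        return "SELECT * FROM $table ORDER BY $orderBy"
            . $this->pagingClause($rowsPerPage, $offset);
    }

    // Each DBMS class supplies its own pagination syntax.
    abstract protected function pagingClause(int $limit, int $offset): string;
}

class MysqlPagingDao extends PagingDao
{
    protected function pagingClause(int $limit, int $offset): string
    {
        return " LIMIT $limit OFFSET $offset";
    }
}

class SqlServerPagingDao extends PagingDao
{
    protected function pagingClause(int $limit, int $offset): string
    {
        // Syntax accepted by SQL Server 2012 and later.
        return " OFFSET $offset ROWS FETCH NEXT $limit ROWS ONLY";
    }
}
```

The same request for page 3 at 10 rows per page produces a different query string from each class, yet the calling code is identical in both cases.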

At this point Version0-00e chipped in with this post:

Quote from JNKlein:

I think this is the part I'm having trouble understanding, because where is the separation of logic if the business entity knows both the business logic and must know the structure of the associated data storage mechanism.

Yer, can see where your going with this. But in my thinking, would the actual database table (columns in this case) actually change as well? Just because your moving to a new(er) database server?

I think in my view the separation for the most part is in removing the data source from the business logic.

On this same subject I'm looking into the same thing with Reflection, so I do not need to have the database table column names within a class, if I am thinking right anyways.

Surely you have getters and setters for each column? What about the code for business rules which must refer to each piece of data by its column name?

This came from JNKlein:

Quote from JNKlein:

To respond to Tony's question about why I separated the User and UserMapper class; if you have a User class that performs some business logic that doesn't interact with the database - suppose a hypothetical printUserName() method that just spits out the current $user->username, wouldn't you want a separate class that mapped a user to the database, either inserting or deleting or what-have-you? Then, if you needed to add more functionality on the business logic end, you would only change the User class (not the UserMapper class) to have another method, suppose printUserEmail(). This way, you can extend or refactor your User business logic (maintaining the same interface), without changing anything about the data access.

Surely if you have information or processing for a user contained in more than one class you are breaking encapsulation? There is no rule that says you must have one class which maps an object to a database and one which does not. I have all the information (properties and methods) for a USER in a single USER class. Within this class I may have a method such as insertUser() which calls the insertRecord() method on the DAO to add that data to the database, and I may have another method sendEmail() which sends the user an email. Just because the second method does not communicate with the database does not mean that I cannot include it with a method which does.

To follow the rules of encapsulation all the methods which deal with an object must be defined within a single class. It does not matter if one method talks to a database, one method sends an email, one method dials a telephone number and yet another method changes the television channel. The internals of each method are supposed to be irrelevant.


Quote from JNKlein:

if you have your business logic and data access logic in the same class, can they be separate?

Each database table class contains both business rules in the form of custom code and meta-data which identifies the table's physical structure as described in the $fieldspec array. By containing all this information within a single class I am adhering to one of the fundamental principles of OOP which is encapsulation.

Although this information is defined within a business object it is not used to access the persistent data store (i.e. database) until it is passed to my Data Access Object (DML class). This uses the information given to it - the table structure meta-data and some user data - to construct the relevant query and then pass it to the specified database engine via the relevant API.

There is nothing in the rules of OOP that says I cannot define information in one object, then pass it to another for processing. It is where this information is actually processed which is important. My $fieldspec array actually contains information which is used in three different places:

  1. Some information is passed to a validation object to perform primary validation.
  2. Some information is passed to the XSL transformation to help build the HTML control for each field.
  3. Some information is passed to the DAO to communicate with the database.

If I were to define this information in three separate places surely this would break encapsulation?
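A small sketch may make the point clearer. The array keys ('type', 'size', 'required', 'control') and the three function names are assumptions chosen for illustration and are not the actual contents of my $fieldspec array; what matters is that one definition feeds three separate consumers, each of which takes only the part it needs.

```php
<?php
// Hypothetical table structure meta-data; the keys used here are assumptions.
$fieldspec = [
    'user_id'    => ['type' => 'string', 'size' => 8,  'required' => true,  'control' => 'text'],
    'start_date' => ['type' => 'date',   'size' => 10, 'required' => false, 'control' => 'date'],
];

// 1. Primary validation: checks user data against the declared sizes.
function validateSizes(array $data, array $fieldspec): array
{
    $errors = [];
    foreach ($fieldspec as $name => $spec) {
        if (isset($data[$name]) && strlen((string) $data[$name]) > $spec['size']) {
            $errors[$name] = "Value exceeds {$spec['size']} characters";
        }
    }
    return $errors;
}

// 2. Presentation: tells the stylesheet which control to build for each field.
function controlHints(array $fieldspec): array
{
    // array_map() with a single array preserves the field names as keys.
    return array_map(function ($spec) { return $spec['control']; }, $fieldspec);
}

// 3. Data access: in this sketch the DAO only needs the column names.
function columnList(array $fieldspec): string
{
    return implode(', ', array_keys($fieldspec));
}
```

None of the three functions contains any information about a particular table; they are logic which operates on whatever meta-data is passed in.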

Remember that my data access object contains no information about any database table whatsoever, so this information has to be passed to it from an external source as meta-data. This does not make the external source part of the data access object, now does it? Similarly the XSL stylesheet, which is used to construct the XHTML output, is useless without an XML file containing the data. This data originates from the business layer, but that does not make the business layer part of the XSL stylesheet, now does it?

If you are prepared to treat the term logic as program code which processes data (information) rather than data which is processed by program code you will see that my usage of the term 'separation of logic' is entirely justified whereas yours is questionable.


Quote from JNKlein:

What is the point of a DAO (chose your favorite - ADODB or PEAR DB) that probably won't make it any easier to change what database you're using, since there is inevitably still hardcoded some query that doesn't work the same in mySQL, msSQL, PostgreSQL, and Oracle, let alone just two of the above. Just because you execute($query) doesn't mean $query will actually work.

It depends on how you have designed your DAO to work. In my case my business object does not construct an SQL query then pass it to the DAO for processing. It passes the components of the query to the DAO, and it is up to the DAO to construct the actual query string according to the peculiarities of the particular DBMS. There is no code in any object in the business layer which is tied to any particular DBMS.

A point of clarification:

When I say that each business object identifies the structure of the associated database table I mean that it contains information (meta-data) about the structure of that table which is in a neutral format and not tied to any particular DBMS.

This information comes into play whenever user data passes through the business layer from the presentation layer on its way to the database, or from the database on its way to the presentation layer. Because it is not user data but information about user data, this information is sometimes referred to as meta-data. It is still not logic. Information is not logic - information is data while logic is code. Refer to Information is not Logic just as Data is not Code for more details.

This post came from seratonin:

Using your example of a data mapper between the domain model (User class) and the data model (database)... If you push the specific mapping code into a data mapper class you only have to make changes in one place. If there is a bi-directional dependency between the domain object and the data access object you have to make changes in two places. The business objects care about business logic and data not how to get the data. The data access objects care about how to get the data not what to do with it. It isn't just a logical separation of business logic and data access logic, it is a separation of concerns.

This raises several points:

  1. If I have a single business object instead of a domain object and a data mapper then I still only have to make changes in one place, so I am not losing out.
  2. When you say that "there is a bi-directional dependency between my business objects and my DAO" it is quite clear that you do not understand what the term "dependency" actually means. You can only say that "module A is dependent on module B" when there is a subroutine call from A to B. The business layer calls the data access layer, therefore the business layer is dependent on the data access layer. There is absolutely NO call made from the data access layer to the business layer, so to say that the data access layer has a dependency on the business layer would be totally, utterly, completely and absolutely wrong.
  3. Information regarding the structure of each database table exists as meta-data within that table's class, not within the DAO. I can therefore make any change I like to any database table without having to make any change to the DAO - all I do is update the table's meta-data in the relevant table class. When a business object wants to communicate with the database it calls a function on the DAO and passes both the user data and the meta-data. The DAO will then use this data to construct and execute the relevant SQL query.
  4. My business objects contain business logic but no database APIs. My DAO contains database APIs but no business logic. My business objects communicate with the DAO whenever they want any database activity. How the DAO satisfies each request is a mystery to the business object, so I have complete separation of logic.
  5. The statement "It isn't just a logical separation of business logic and data access logic, it is a separation of concerns" worries me as you are implying that the "separation of concerns" is not the same as the "separation of logic", which in turn implies that "concerns" includes information (data) as well as logic (program code). This is plainly wrong - see Information is not Logic just as Data is not Code for the reasons why.

4. Information is not Logic just as Data is not Code

There are some significant points from the previous item which are worth a more detailed explanation. Separation of logic is not the same as separation of information. This is where a little misinterpretation can cause a lot of confusion.

  1. Logic and information are not the same:

  2. Logic (code) is not the same as information (data). The two are totally different. It is code which processes or transforms data in some way. Data may pass through many layers in its journey between the database and the user, and there is different code (logic) in each layer which is responsible for processing or handling the data in a particular way:

    Note also that the presentation layer may be modified in order to output the data in a different format, such as CSV or PDF.

  3. The separation of logic means that the code which performs a particular type of processing should be separate from the logic which performs other types of processing. This is done by putting the logic (code) into a different component which usually exists in a different layer.

    Each layer has a single responsibility - HTML, business rules, or SQL - and each responsibility is carried out in a single layer. This is why the "separation of logic" is also known as the "separation of responsibilities".

  4. The fact that some information may be obtained from one layer and used in another layer does not mean that the source layer contains the other layer's logic. Logic is code, not data. Information is data, not code. Code is fixed and exists in one place whereas data is variable and can move between layers. When data is processed by code that data does not become part of the code as the code and data are still separate. When information is processed by logic that information does not become part of the logic as the logic and information are still separate.

  5. Separation of information cannot be achieved by layering. My infrastructure is based on a combination of the 3-Tier architecture and the Model-View-Controller design pattern which are both concerned with the separation of logic into different components in different layers. The separation of information into similar components or layers is pointless and irrelevant as data has to be able to pass through any of those layers in order to be processed by the logic (code) in those layers. Data has to be made visible to the user by logic in the presentation layer, checked against business rules by logic in the business layer, and stored in and retrieved from the database by logic in the data access layer. The nearest you can get to separation of information is to have a different object in the business layer to deal with each different entity that the application has to deal with. Thus there would be a CUSTOMER object to deal with customer data, a PRODUCT object to deal with product data, and an ORDER object to deal with sales order data. Each business object would contain the business rules for that entity, and these business rules may exist as meta-data as well as code. This separation of information is actually a by-product of encapsulation which states that all the data for an entity/object should be placed in a single class. As far as I am concerned this "data" includes meta-data as well as user data.

5. Can an object return more than one row?

In this post the following point was raised by lazy_yogi:

if you want to get a list of users whose first name is 'John', do you use this below? Because logically does it make sense for a user to return a list/resultset of users?
$result_iterator = User::GetByFirstName('John'); 

To which I replied:

In my infrastructure I would use the following:
$object = new User; 
$where = "first_name='John'"; 
$data = $object->getData($where); 
Notice here that the $where string could be anything, so the getData() method is completely general purpose.

While it is true that Martin Fowler's Row Data Gateway pattern is limited to one instance per row, he also has the Table Data Gateway pattern where one instance can handle all the rows in the table.
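The idea of a single table-level object that can return any number of rows can be sketched as follows. The class names are hypothetical, and the DAO is a stub which records the query it would run and returns canned rows instead of executing anything, so the example shows only the shape of the call and not a real database access.

```php
<?php
// Stub DAO for this sketch: records the query it would run and returns canned rows.
class StubSelectDao
{
    public $lastQuery = '';
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function getData(string $table, string $where): array
    {
        // An empty $where selects every row in the table.
        $this->lastQuery = "SELECT * FROM $table" . ($where !== '' ? " WHERE $where" : '');
        return $this->rows; // a real DAO would execute the query instead
    }
}

// One table-level object handles any number of rows from its table.
class PersonTable
{
    private $dao;

    public function __construct(StubSelectDao $dao)
    {
        $this->dao = $dao;
    }

    // General-purpose read: the selection criteria are supplied by the caller.
    public function getData(string $where = ''): array
    {
        return $this->dao->getData('person', $where);
    }
}
```

The result is an array of rows rather than an array of objects, which is why one instance is enough regardless of how many rows match.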

This post was raised by seratonin:

The user (domain object) is not what should return a collection of User objects. This is the responsibility of the UserMapper:
<?php 
$mapper = new UserMapper($db); // "=& new" was PHP 4 usage and no longer parses in PHP 7+

// returns an array of User objects 
$users = $mapper->findUsersByName("John"); 

foreach ($users as $user) { 
    echo $user->getId(); 
} 
?> 

To which I replied as follows:

I totally disagree. All communication regarding a user, whether it be reading, inserting, updating or deleting, goes through a single USER object. It is then up to this object to decide how to satisfy the particular request. If you have to have a separate UserMapper class (which I would not) then surely this should be accessed from the User class itself?

Note that my User class does not access the database directly - it goes through a data access object which is responsible for generating the actual SQL query. Perhaps this serves the same function as your UserMapper?

This brought the following reply from seratonin:

Well, if that is the case then you do not have a true domain model (at least in the PoEAA sense). Which is fine. Your implementation is closer to the Active Record pattern which mixes domain logic and data access logic.

I do *NOT* use the Active Record pattern as I do *NOT* mix domain logic and data access logic. All domain logic is contained within a business object while all data access logic is contained within a totally separate data access object (DAO). They *ARE* separate, they are *NOT* mixed.

This post came from Brenden Vickery:

After looking at Tony's example application, he uses no domain model that I can tell. Tony uses a kind of Table Data Gateway with a layer supertype that also acts as a sql query builder that uses meta data described in the TDG.

Using the User example, Tony doesn't make User objects that have properties like name, dob, etc. The User objects are only concerned about data access and validation of data being put in the database.

Which prompted this response from seratonin:

I can definitely see how he is using it as a Table Data Gateway. I guess it is more of a naming thing that was misleading me.

Yes, each business object fits the Table Data Gateway pattern as it can deal with more than one row from a database table, but all SQL queries are constructed and executed in a separate data access object.
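The arrangement described above can be illustrated with a minimal sketch. It is not taken from the actual infrastructure: the class SketchDao, its select() method, the constructor injection and the canned rows are all assumptions made purely to keep the example self-contained, and the query is recorded rather than executed.

```php
<?php
// Hypothetical data access object: the only place where SQL text is built.
// To stay self-contained it records the query instead of running it.
class SketchDao
{
    public $lastQuery = '';
    private $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;   // stand-in for rows a real DBMS would return
    }

    public function select($tableName, $where)
    {
        $this->lastQuery = "SELECT * FROM $tableName"
                         . ($where !== '' ? " WHERE $where" : '');
        return $this->rows;
    }
}

// Business object: one class per table, able to return any number of rows.
class User
{
    private $tableName = 'user';
    private $dao;

    public function __construct(SketchDao $dao)
    {
        $this->dao = $dao;
    }

    // General purpose: $where may be any selection criteria.
    public function getData($where = '')
    {
        return $this->dao->select($this->tableName, $where);
    }
}

$dao = new SketchDao(array(
    array('user_id' => 1, 'first_name' => 'John', 'last_name' => 'Smith'),
    array('user_id' => 2, 'first_name' => 'John', 'last_name' => 'Jones'),
));
$user = new User($dao);
$data = $user->getData("first_name='John'");

echo count($data), " rows\n";
echo $dao->lastQuery, "\n";
```

The calling code only ever talks to the User object, which can hand back many rows, while the SQL text exists solely inside the data access object.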

This post was raised by Version0-00e:

The Row Data Gateway if I remember is for returning only the one row of data?

In that case, isn't this a Row Data Gateway,

$object = new User; 
$where = "first_name='John'"; 
$data = $object->getData($where); 
In which case is wrong no? Would need to look at Tony's script for myself to determine this though.

To which seratonin replied with this post:

The Row Data Gateway represents/manages one row in the database. The Table Data Gateway represents/manages an entire table. They are both low-level data access patterns. They operate on the data model exclusively. The Data Mapper pattern, however, is a higher-level pattern which maps between the data model and the domain model.

Let me make it quite clear that my design is not based on any patterns from Martin Fowler's PoEAA book. My design contains a mixture of various patterns which may or may not have direct counterparts in that book. The patterns that I use are the result of years of experience, not book reading.

This design works, therefore it is not wrong. Just because this design does not conform to your favourite patterns does not make it wrong (except in your eyes). It just means that I have not used your favourite patterns. That is not a crime (except in your eyes), it is a matter of personal choice.

6. What does 'encapsulation' really mean?

JNKlein first brought up the idea of having a User class and a UserMapper class to which I replied:

Surely if you have information or processing for a user contained in more than one class you are breaking encapsulation?

This brought the following response from lastcraft:

Encapsulation is about hiding implementation, which includes data. There is no requirement for every aspect of the "User" concept to be in a single class. In fact this is damaging, because such a kitchen sink class would be very inflexible. You are basically describing the Facade pattern, which is seldom used, never mind it being any kind of rule.

Flexible classes have a single role within the system, a concept called "cohesion". However you don't usually want every single behaviour of a concept in a separate class either. That would be overkill. For that reason we usually split the concept into just enough classes to do the job in the myriad ways we need.

A DataMapper splits persistence off from the domain object leaving both classes more cohesive. The price you pay is extra client code handling two objects. What you gain is divide and conquer on the complexity of the code. Smaller classes are easier to get right. You can also swap them around.

This last part caught my eye:

You can use the application with different databases just by choosing a different mapper at run time without touching any of the domain object code. This makes it easier to test as well.

So where you would use a mapper to switch from one database to another I would use a data access object. Isn't this the same thing but with a different name? I certainly don't need to use both a data mapper and a data access object.

I responded with the following:

Encapsulation means that the class must define all the properties and methods which are common to all objects of that class. All those properties and methods must exist inside a single container or 'capsule', and must not be distributed across multiple locations.

There is nothing in the principles of OOP which says that different aspects of an object must be contained within different classes, in fact it states quite the contrary. Therefore I consider your opinion to be totally wrong.

My understanding of the term 'encapsulation' is supported by Encapsulation is not information hiding which clearly states:

Encapsulation rule 1: Place data and the operations that perform on that data in the same class

Which caused Version0-00e to raise this post:

Therefore I consider your opinion to be totally wrong.
Now, that is arrogance, and as one person after following many posts by lastcraft, I'd have to disagree with your statement Tony.

Lastcraft has basically explained encapsulation, and from this statement is something I can take from it as I've near as damn it read something much along the same lines before elsewhere, Thinking In Java 3rd Edition if I remember?

Take it easy, you'll end up with a bad reputation, and as someone talking from experience it doesn't do you any good around this parts

My reply quickly followed:

How can I be arrogant for following the principle of encapsulation which specifically states that all methods and properties for an object should be encapsulated in a single class? Thus if I want to do anything with a User I invoke a method on a single User class. I do not have a separate class which involves database access and one which does not. It is up to the User class to decide for itself how to satisfy that method. If it involves accessing a database, then so be it. If it involves pulling data out of thin air, then so be it. How a method is implemented is supposed to be irrelevant.

If you believe that encapsulation means having properties and methods contained within more than one class then you are hopelessly wrong. If my bringing this to your attention makes me arrogant, then so be it. What does it make you?

This response came from lastcraft:

...the term 'encapsulation' is defined as...
You'll excuse me if I don't go searching an entire site for one out of context quote. The problem with quoting a beginner's tutorial is that you will get a deliberately simplified picture. An attitude that claims expert status on the basis of quoting a beginners tutorial is something I won't even go into on what is normally a very polite forum.

I have never claimed to be an OOP expert. I have never claimed that my infrastructure is 100% OO. All that I have stated is that I have 'made use of the OO capabilities of PHP' in building my infrastructure. Note that it will run in PHP 4 as well as PHP 5.

And what is this 'quoting a beginner's tutorial' lark? The basic principles of OOP - that of encapsulation, inheritance and polymorphism - are supposed to remain the same whether you are a novice or an expert. If they keep changing every 5 minutes then no programmer stands a snowball's chance in Hell of ever getting it right.

How you choose and name a class is a very complex design problem. Everything after that is easy. You may have an idea that the piece of code you are writing is something to do with users, but your program needs to be more precise. You have to decompose that vague idea into specific roles to which you can assign responsibilities. A User could also be described as Person plus AccessKey for example (this is actually advisable, but that's a whole other topic). Immediately we have decomposed User into more refined classes just by taking a different viewpoint. No different than if we split it into User and UserMapper.

It appears that one of the biggest problems that people have with OOP is deciding amongst the following:

I have built many successful systems in the past based on nothing more than an Entity-Relationship Diagram (ERD), so that is where I started with my OOP project. Each entity that the application needs to deal with, such as Customer, Product and Invoice, becomes a table in the database and an object in the software. It's simple and it works, so why make it more complicated than it need be?

Your concept of a class is rather naive.

Or perhaps your concept of a class is more complex than it should really be.

Here are all of the GOF design patterns that would break encapsulation according to your imaginary "rule": AbstractFactory, Builder, FactoryMethod, Adapter, Bridge, Composite, Mediator, Memento, Strategy, Visitor.

To list all of the enterprise patterns that you would not be able to use would take all day. From the top of my head that is a large part of Fowler, Nock, Evans and Beck and that's just from looking behind me at my bookshelf.

And what is their justification for breaking encapsulation and spreading knowledge about an object over multiple classes? How is this supposed to make the programmer's job easier if he has to decide which class to use to carry out which function?

There is nothing in the principles of OOP which says that different aspects of an object must be contained within different classes, in fact it states quite the contrary.
There is nothing to say things have to be split into classes any more than there is your imaginary rule that they cannot. It is often convenient and more flexible to do so.

That is not an opinion I share.

It's not a matter of opinion, it's a matter of depth of understanding. That's a lot of hard work.

It has always been my opinion that the best solution to a problem is the simplest, which is why I follow the KISS principle. Making something more complex than it need be is like attempting to push a piece of string - very hard work!

I would never dream of claiming that one personal form of layering was the correct architecture for every site and problem. I would consider it silly, because I know the real world is a lot more subtle and complex than it first appears. When I approach a site architecture problem I come armed with a whole bunch of solutions, which I gather voraciously, ready to weigh the pros and cons of each.

So what you are actually saying then is that this 'one size fits all' concept is not actually valid, that different applications may be approached in different ways, that there is a huge variety of possible solutions available from which I am able to pick and choose as I see fit?

I don't look for the first quote on the web that I can find, misunderstand it, and then use it to eliminate whole families of solutions.

What is there to misunderstand about 'all the methods and properties for an object are encapsulated in a single class'?

This post came from JNKlein:

about this encapsulation usage - where do you draw the line? When does your User class begin to contain too much and become a "god" class. Why don't you have your View and layout properties within the User class? (I know the answer to that question - just posing it as a hypothetical given the nature of encapsulation.) And so on ... after all, isn't your whole application one entity that could be stuffed into a single class?

According to my understanding a god class is one that has too many responsibilities, such as a combination of presentation logic, business logic and data access logic, or too much data, such as a combination of Product, Customer and Order data. As I have separate objects which deal with each of those responsibilities (that is why I chose the 3-tier architecture) and separate classes for each entity (such as Product, Customer and Order) I have absolutely nothing which approaches the definition of a god class.

This idea was echoed by DougBTX:

That definition does not explain why you should put everything related to a user in a single class. It is like saying that all the code in the application should go in an Application class.

The idea of putting the whole application into a single class sounds as sensible as putting all your application data into a single database table. If I have a separate database table for each entity is it not sensible to have a separate class for each entity? If I only have a single table which holds User data then why can't I have a single User class?

How do you get your head around a separation between data manipulation (insertRecord() in User class) and business logic (printUserName() in User class) within the same class. In a related question - how do you structure your library to account for this combination?

Simple. I would never have a printUserName() method in the User class. A page controller retrieves data from a business object (in this example the User class), then gives it to a View which may produce either HTML, PDF, CSV or whatever output. The User class simply returns data without knowing what will be done with it. The business object does not contain any formatting logic as that is the responsibility of the presentation layer. The business object simply receives a request and responds with data, and what happens to that data afterwards is none of its concern.
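The division of labour described above might look something like the following sketch. The names UserStub, renderHtml() and renderCsv() are hypothetical, and the rows are canned so that the example runs without a database. The actual infrastructure produces its HTML through XSL transformations, so plain PHP functions are used here only as a stand-in.

```php
<?php
// Business object stub: returns raw data and knows nothing about output formats.
class UserStub
{
    public function getData($where = '')
    {
        // canned rows; a real object would obtain these through its DAO
        return array(
            array('user_id' => 1, 'user_name' => 'John Smith'),
            array('user_id' => 2, 'user_name' => 'Jane <Doe>'),
        );
    }
}

// Two hypothetical views which format the same data differently.
function renderHtml(array $rows)
{
    $out = "<ul>\n";
    foreach ($rows as $row) {
        $out .= '  <li>' . htmlspecialchars($row['user_name']) . "</li>\n";
    }
    return $out . "</ul>\n";
}

function renderCsv(array $rows)
{
    $out = '';
    foreach ($rows as $row) {
        $name = str_replace('"', '""', $row['user_name']);
        $out .= $row['user_id'] . ',"' . $name . "\"\n";
    }
    return $out;
}

// Page controller: fetch from the business object, hand the result to a view.
$user = new UserStub;
$rows = $user->getData();
$html = renderHtml($rows);
$csv  = renderCsv($rows);

echo $html, $csv;
```

Adding a third output format means adding another view, and the business object is not touched.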

This post was added by lastcraft:

If you consider a DataMapper to be breaking encapsulation then your view of it is simplistic. It actually achieves greater encapsulation by making the data access signatures invisible to the domain. It also moves the creation of the data access classes from the domain to the application code where configuration usually takes place, thus adding a flex point in the place it needs to be. The DataAccessor pattern buries the choice of data access inside the domain objects. If you ever want to change the DB, although you won't usually, then you are in trouble.

Changing the DBMS is no trouble at all in my infrastructure, as discussed here.

The only way to change the choice of database without editing the domain layer (using DataAccessors) would be to have some hidden configuration object working behind the scenes. This would mean that the code that controls this class's behaviour would appear in two separate places, the domain object and the configuration. This means you could secretly change the implicit behaviour (domain) with an external interface (configuration) without going through the explicit interface. This would be "bad" design because it would break encapsulation.

In my infrastructure I have the ability to switch from one DBMS to another simply by changing a single object, the DAO. My implementation of the DAO does not break encapsulation as the DAO does not contain any information about the structure of any database table in the application. This information is defined within each business object as meta-data and is passed to the DAO at runtime when the business object requires to communicate with the database. I do not consider that my approach breaks encapsulation as the information about each business object is held within that object. The fact that this information includes meta-data should not be a problem.

Here's an opinion expressed by lazy_yogi which does not seem too far from my own:

I also feel that encapsulation is one of the most important rules, and splitting up classes does add a layer of complexity. I don't know if one way is 'better' than the other, but I prefer the simpler method of a single class, and only feel a separate mapper class is required for very complex systems with a badly designed legacy database design that needs complex mapping between objects and the database.

People that ignore the KISS principle can end up with a horrid system that is horribly difficult to use and maintain. I've seen systems that are worse than procedural code with globals all over the damn place. And at the end of the day, the point of design and OOP is for easy maintenance.

Here's another opinion from dagfinn:

I have to agree with Marcus that this is not a black and white issue. But you have a point, and it's about the relation between a conceptual object and a syntactical one. There are reasons to try to avoid fragmenting a conceptual object too much. I believe this is part of the reason why J2EE has been losing popularity, that it causes just this kind of fragmentation.

This post was raised by lastcraft:

If I have an entity called Customer then I will put all information, properties and methods, for that entity in a single class. This is what encapsulation is all about. I do not see the point of breaking that up into smaller classes, each containing a subset of information.
This is so idiotic it doesn't even stand on it's own terms. If a Customer has an employer then I have to make all messages to the Employer class go through the Customer? If I write code to bulk mail my customers I have to pass in the whole Customer class rather than just a Contact object. That's your idea of encapsulation?

It is you who are confusing the issue by including new entities Employer and Contact into the equation. A Customer may be an Employer, and a Customer may have Contacts, but that does not mean that they are all defined within the same class. The simple rule of thumb that I follow is that if an entity requires its own table in the database then it requires its own class, and the rules for that entity are encapsulated within that class. This does not prevent the possibility of defining views which access more than one object. Thus I can have 'show me all the contacts for customer A' (which accesses customer+contact) as well as 'show me the details for contact Joe Soap' (which accesses contact only).

Then you clearly have not understood my design.
I am not the slightest bit interested in your design and haven't looked at it. I am simply responding to your comments so far. I also feel a duty to defend the community spirit in this group, otherwise I wouldn't even respond. Really your posts barely belong in the advanced forum.

So you haven't looked at my implementation, yet you keep complaining that it's wrong? Isn't that just a teensy-weensy bit arrogant?

...so if I ever want to change databases all I have to do is instantiate my DAO from a different class.
If switching DB is a possibility then you have added a flex point. You have done it in a way that breaks encapsulation (as I explained before). A Data Mapper would probably be a cleaner solution given that requirement. From what you have just said though, it sounds like your system is nearer to client-server rather than a domain driven system in any case.

My DAO does not break encapsulation (at least my interpretation of encapsulation) as explained here.


My understanding of the term 'encapsulation' comes from an OOP Tutorial - What is a Class? which states:

A class is a blueprint, or prototype, that defines the variables and the methods common to all objects of a certain kind.

This means that all the variables and methods for an entity go into a class, a single class, so that if it ever becomes necessary to amend any variables or methods for an entity then it is only necessary to amend a single class. One of the benefits of OOP is supposed to be a reduction in application maintenance due to the fact that information about an entity is encapsulated in a single class and not dispersed across multiple classes. The idea of dispersing information about an entity across multiple classes appears to be a complete reversal of the basic principles of OOP. It breaks encapsulation, and by doing so it loses any savings to be made in application maintenance. I have not seen this apparent reversal justified or explained in any way, therefore I am unwilling to accept it.

Within my infrastructure the vast majority of classes exist in the business layer which acts as a buffer between the presentation layer (which handles all communication with the user) and the data access layer (which handles all communication with the database). Each of the classes in the business layer is therefore required to hold the business logic for an entity. This 'business logic' is supposed to identify how application data for that entity needs to be processed on its path between the presentation layer and the data access layer, in both directions. These 'business rules' are a mixture of meta-data and optional lines of code added by a programmer. Some of this meta-data is processed within the business object itself, while some of it may be passed to the presentation layer or data access layer where it is processed according to the requirements of that particular layer. It is accepted practice within OOP to have a class which does not have any implementation details defined within it, but to have those details supplied either through inheritance or after an object is instantiated from that class.
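As an illustration of business rules held as meta-data, here is a minimal sketch under stated assumptions. The class names TableObject and Product, the $fieldspec property and the keys 'type', 'size' and 'required' are invented for this example and are not the actual structures used in the infrastructure. The generic validation code is written once in a superclass, and each entity supplies only its own meta-data.

```php
<?php
// Generic superclass: validation logic is written once and driven by meta-data.
abstract class TableObject
{
    protected $fieldspec = array();   // supplied by each subclass

    public function validate(array $row)
    {
        $errors = array();
        foreach ($this->fieldspec as $field => $spec) {
            $value = isset($row[$field]) ? (string) $row[$field] : '';
            if ($value === '') {
                if (!empty($spec['required'])) {
                    $errors[$field] = 'is required';
                }
                continue;   // nothing further to check on an empty value
            }
            if (isset($spec['size']) && strlen($value) > $spec['size']) {
                $errors[$field] = 'exceeds ' . $spec['size'] . ' characters';
            } elseif ($spec['type'] === 'integer' && !ctype_digit($value)) {
                $errors[$field] = 'must be an integer';
            }
        }
        return $errors;
    }
}

// Entity class: contains meta-data only; extra rules could be added as code.
class Product extends TableObject
{
    protected $fieldspec = array(
        'product_id'   => array('type' => 'integer', 'required' => true),
        'product_name' => array('type' => 'string', 'size' => 40, 'required' => true),
        'unit_price'   => array('type' => 'string', 'size' => 10),
    );
}

$product = new Product;
$errors  = $product->validate(array('product_id' => 'abc', 'product_name' => ''));
print_r($errors);
```

A change to a column's size or type then becomes a change to one array entry rather than to any validation code.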

Although this meta-data helps the other layers carry out their required duties it should not be defined within any of these other layers for the following reasons:

7. You are using the wrong design patterns

This statement came from lazy_yogi:

Most everyone here quotes Martin Fowlers PoEAA book which is where much of the terminology comes from (Table gateway, row gateway, mapper, etc ..). The book is exceptional - IMO the best book i've read on fine grained architecture, but many sitepoint members often regard it as a bible and don't think why they are following it.

That is the problem - you are following a 'bible' like a typical religious zealot - without thinking.

This statement came from kuato:

And as for the issue of DataMappers, again it's ONE possible design pattern but not the magical one. There are many ways to skin a cat including ActiveRecords and the RowDataGateway+TableDataGateway approach that Propel takes. For some things I use hand coded DataMappers for others I use raw SQL (eww yea gross but it's fast as hell to code) and for a project I'm working on now I use Propel.

There isn't one answer to fit everybody's needs.

This suggestion came from lastcraft:

You really, really should read the (Martin Fowler) enterprise patterns books.

Because so many people here are familiar with Martin Fowler's patterns they have a tendency to categorise my infrastructure based on those patterns. This is a mistake as I have never read his book and have never (consciously, at least) sought to implement any of his patterns. When these people then complain that my infrastructure does not conform to their interpretation of these patterns they are missing a fundamental point - I am under no obligation to use the same patterns as you, therefore your complaints are without merit.

Here are some brief descriptions of the patterns which have been mentioned so far:

Domain Model: A Domain Model creates a web of interconnected objects, where each object represents some meaningful individual, whether as large as a corporation or as small as a single line on an order form.
Table Module: A Table Module organizes domain logic with one class per table in the database, and a single instance of a class contains the various procedures that will act on the data. The primary distinction with Domain Model (116) is that, if you have many orders, a Domain Model (116) will have one order object per order while a Table Module will have one object to handle all orders.
DataMapper: The Data Mapper is a layer of software that separates the in-memory objects from the database. Its responsibility is to transfer data between the two and also to isolate them from each other. With Data Mapper the in-memory objects needn't know even that there's a database present; they need no SQL interface code, and certainly no knowledge of the database schema. (The database schema is always ignorant of the objects that use it.) Since it's a form of Mapper (473), Data Mapper itself is even unknown to the domain layer.
ActiveRecord: An object carries both data and behavior. Much of this data is persistent and needs to be stored in a database. Active Record uses the most obvious approach, putting data access logic in the domain object. This way all people know how to read and write their data to and from the database.
RowDataGateway: A Row Data Gateway gives you objects that look exactly like the record in your record structure but can be accessed with the regular mechanisms of your programming language. All details of data source access are hidden behind this interface.
TableDataGateway: A Table Data Gateway holds all the SQL for accessing a single table or view: selects, inserts, updates, and deletes. Other code calls its methods for all interaction with the database.

Yet another pearl of wisdom which came from lastcraft:

Because you have never looked beyond your favourite two pattern books (whatever they are) everything looks innovative to you. There is nothing innovative in your system except where you have committed some kind of random design error. The data gateway patterns are the most primitive of all, but somehow you still managed to mix them into a confusing mess. You have used a TransformView over a TemplateView in the one and only set up where it's complete overkill. TransformView is way more infrastructure than templating and poses restrictions on the skills of the developers and designers.

There are hundreds of different books available on design patterns, which means that there are thousands of individual patterns to choose from. Some books have different names for the same pattern, or the same pattern can have several variations. Some patterns may be inapplicable in a web environment, or may be inappropriate for the application being developed. You must also bear in mind that design patterns are merely outlines and do not specify any particular implementation. They may be implemented in different languages and in different ways. Each person should therefore be free to use whatever set of patterns with which he feels comfortable, and should not be penalised just because he is not using your own personal set of favourite patterns, or not implementing them in the exact same way that you do.

This post came from lastcraft:

There are many ways to choose from, each with their own sets of advantages and disadvantages, and provided that the chosen method works and both the developers and the end users can live with the advantages and disadvantages, then all in the garden is rosy.
When combinations don't work well together then they just create extra work or problems. Your garden seems to have more than a few weeds. It might be time to break out the DDT.

Which produced this reply:

Then you obviously have not run my code otherwise you would have noticed that the different components DO work well together. I designed them that way specifically to avoid the problems I have encountered in other people's designs.

8. How easy would it be to change the database?

This point was raised by lastcraft:

The DataAccessor pattern buries the choice of data access inside the domain objects. If you ever want to change the DB, although you won't usually, then you are in trouble.

That may be a problem for you, but it is not a problem with my infrastructure. This was detailed in my reply:

Then you clearly have not understood my design. I have a separate class for each entity which contains the structure of the associated database table as meta-data. None of these classes communicates with the database directly, this is all done through a separate data access object. Note that I do not have a separate DAO for each database table, I have a single DAO for the entire application. When the business object wishes to communicate with the database, such as adding or updating a record, it passes two pieces of information to the DAO - the user data and the meta-data. It is up to the DAO to then construct the relevant query string then call the API for that database. Also note that I have a separate version of the DAO for each different database engine, so if I ever want to change databases all I have to do is instantiate my DAO from a different class.

Note that the move to MySQL version 4.1 and above is the same as switching databases as the database APIs are different.

This is one of the benefits of the 3-tier architecture - being able to change the data access layer (i.e. switch from one DBMS to another) without having any effect on the business or presentation layers.
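The mechanism described in this section can be sketched as follows. This is an illustration only, not the actual code: BaseDao, MysqlDao, PgsqlDao, buildInsert() and the constructor injection are hypothetical, the DAO merely builds the query string instead of calling a database API, and the value escaping is deliberately naive. The two DAO classes differ in how they quote identifiers (backticks for MySQL, double quotes for PostgreSQL).

```php
<?php
// Shared logic: build an INSERT from user data filtered by the table's meta-data.
abstract class BaseDao
{
    abstract protected function quoteName($name);

    public function buildInsert($table, array $row, array $fieldspec)
    {
        $cols = array();
        $vals = array();
        foreach ($fieldspec as $field => $spec) {
            if (!array_key_exists($field, $row)) {
                continue;   // field not supplied
            }
            $cols[] = $this->quoteName($field);
            // naive escaping for illustration only; a real DAO would use the driver's own
            $vals[] = "'" . str_replace("'", "''", (string) $row[$field]) . "'";
        }
        return 'INSERT INTO ' . $this->quoteName($table)
             . ' (' . implode(', ', $cols) . ') VALUES (' . implode(', ', $vals) . ')';
    }
}

// One DAO class per database engine.
class MysqlDao extends BaseDao
{
    protected function quoteName($name) { return '`' . $name . '`'; }
}

class PgsqlDao extends BaseDao
{
    protected function quoteName($name) { return '"' . $name . '"'; }
}

// Business object: holds the table's meta-data, never builds SQL itself.
class Customer
{
    private $fieldspec = array(
        'customer_id' => array('type' => 'integer'),
        'name'        => array('type' => 'string', 'size' => 40),
    );
    private $dao;

    public function __construct(BaseDao $dao) { $this->dao = $dao; }

    public function insertRecord(array $row)
    {
        // pass both the user data and the meta-data to the DAO
        return $this->dao->buildInsert('customer', $row, $this->fieldspec);
    }
}

$row   = array('customer_id' => 1, 'name' => "O'Brien", 'ignored' => 'x');
$mysql = new Customer(new MysqlDao);
$pgsql = new Customer(new PgsqlDao);

echo $mysql->insertRecord($row), "\n";
echo $pgsql->insertRecord($row), "\n";
```

Switching engines changes only which DAO class is instantiated, and the Customer class is identical in both cases.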

9. There is a dependency between your db schema and the presentation layer

This post came from Version0-00e:

If it comes to changing the structure of a table by adding or removing fields, then I agree that I have to make changes in my presentation and business layers, but such changes are extremely small being limited to adding or removing a field name from a simple list.
This is ridiculous. Doesn't really matter how much of a change you'd need to make, if you had to change your database schema, and then have a dependency in your presentation which'd have to be changed as well, this is breaking encapsulation as well.

There is no clear separation of concerns there. Period, regardless of how much or little you have to change. Also complicates the maintenance of the presentation in my view?

It is quite clear from your statement that you do not understand what the term "dependency" actually means. You can only say that "module A is dependent on module B" when there is a subroutine call from A to B. The presentation layer calls the business layer, therefore the presentation layer is dependent on the business layer. The business layer calls the data access layer, therefore the business layer is dependent on the data access layer. The data access layer does NOT call the presentation layer, therefore there is NO dependency between the data access layer and the presentation layer.

If I change my database schema to add or remove a column then I also have to make corresponding changes to various components within my infrastructure:

The fact that I have to make such changes is not a weakness of my design, it is standard operating practice in every computer system under the sun. Rather than making maintenance more complicated, I regard my approach as being less complicated than any other that I have seen in my several decades in this business.

This is NOT a weakness in my design, it is a fact of life. You cannot write a business object that does not have knowledge of the data it has to work with, nor can you write a presentation layer that does not have knowledge of which fields to place where, and with what HTML control.
If it's a fact of life, then how come I can alter a database schema and in no way effect the template(s) I use? If I had to alter the database schema, only other thing that needs to be altered would be the models, ie
class User {
    public $id;
    public $date;
    public $username;
    public $password;

    public function __construct() {
    }
    public function get() {
    }
    public function insert() {
    }
    public function update() {
    }
    public function delete() {
    }
    ...
    public function get_param( $param ) {
        return $this -> $param;
    }
}
So I need to change the properties of this class, and maybe alter the class method 'get_param()' but the presentation need not change.

A rule I use is that if a layer directly interacts with the presentation then either use getters (as I have in above script) or use a helper class instead. Other than that, if the layer is lower down the hierarchy then use properties directly.

If you add or remove a column from the database, and then want to add or remove that column from a screen, how can you do so WITHOUT amending that screen's definition in the presentation layer?

If you add a new getter and setter to the business object, then how can you do so WITHOUT amending the presentation layer which is the ONLY place from where the new getter and setter can be called?

I should point out here that if I change a table's structure in any way I do *NOT* have to amend any class properties or methods. I simply use my data dictionary to regenerate the table's structure file. Is it that easy in your code?
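
One way to picture this arrangement is a class that has no per-column properties and instead reads its field list from a generated structure. The sketch below is only an assumption about how such a thing could look: the loadUserStructure() function, the GenericTable class and the layout of the specification array are all hypothetical, and the structure is inlined rather than read from a file so that the example is self-contained.

```php
<?php
// Hypothetical stand-in for a structure file exported from a data dictionary.
// Regenerating this array is the only change needed when a column is added or removed.
function loadUserStructure(): array
{
    return [
        'user_id'  => ['type' => 'string', 'size' => 16, 'required' => true],
        'username' => ['type' => 'string', 'size' => 40, 'required' => true],
    ];
}

class GenericTable
{
    private array $fieldspec;

    public function __construct(array $fieldspec)
    {
        $this->fieldspec = $fieldspec;
    }

    // Validate the input against the structure; no field name is hard-coded here.
    public function validate(array $data): array
    {
        $errors = [];
        foreach ($this->fieldspec as $name => $spec) {
            $value = (string) ($data[$name] ?? '');
            if ($spec['required'] && $value === '') {
                $errors[$name] = 'This field is required';
            } elseif (strlen($value) > $spec['size']) {
                $errors[$name] = 'Value is too long';
            }
        }
        return $errors;
    }
}
```

Adding a column to the hypothetical structure array changes what validate() checks without any edit to the class itself.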

To this point:

If it's a fact of life, then how come I can alter a database schema and in no way affect the template(s) I use?

I raised the following question:

I can change the database without touching any of my XSL templates, but I still have to amend the list of fields which I want that template to display.

How do you instruct your templating system on which fields to display, what size, in which order, what field labels to use, what controls to use? Or does it decide for itself, by magic?

I made this statement:

I have worked with several languages over the past 20 years that have used data dictionaries (aka application models) where the dictionary is a representation of the physical database (i.e. meta-data). It is imperative that the dictionary and the database are kept synchronised - if you make changes to one without making corresponding changes to the other the software will fail.

To which lastcraft issued this response:

That's not the issue. The issue is letting this information bleed to higher layers.

This concept of "information bleeding to a higher layer" is pure fabrication on your part and therefore not an issue at all. Information is data, and data is supposed to flow between layers in order to be processed by the logic (program code) within each layer:

What I have done is to place this meta-data within each business object, so if I change the physical structure of the database I must also change the dictionary (meta-data).
It's called coupling, which is poor encapsulation.

Coupling has nothing to do with encapsulation. Coupling describes how modules interact, while encapsulation describes what properties and methods they contain. The two terms are unrelated.

Then you have obviously not used any dictionary-based systems otherwise you would know that keeping the dictionary synchronised with the physical database is of paramount importance.
They work just fine as they are, and they would not work any better if I changed them, so why change them?
(Self censored. Even I admit I went too far.)

I take it then that your 'advice', if I can call it that, is not to make my software better, but to make it look more like your software which is not better, just more complicated. What makes you think that I want to pollute perfectly good software in such a way?

dagfinn then came up with this highly significant quote from Martin Fowler:

Layers encapsulate some, but not all, things well. As a result you sometimes get cascading changes. The classic example of this in a layered enterprise application is adding a field that needs to display on the UI, must be in the database, and thus must be added to every layer in between.

So if your hero says that, how can you possibly disagree?

10. You are arrogant and unwilling to learn

This item came from lastcraft:

You are fond of emphatic statements, presumably to sound authoritative, so at the risk of being bounced by the moderators I'll make one. Your bullying tactics may work on the much broader Usenet groups, but Sitepoint is different. I know of no member of this forum that is not willing to learn and the majority (like myself I hope) are doing their best on that journey. As a result there is a rare degree of skill and understanding here and no one gets an easy ride. You expect people to painstakingly read your first attempts at OO design when you yourself cannot even be bothered to read the standard texts? You claim you are not arrogant?

It's like some local thug suddenly gaining entry to a criminal gang and finding themselves out of their depth. Your bravado just sounds shrill and desperate.

Where do I begin with this collection of accusations?

Also from lastcraft:

If you are completely ignorant of alternate solutions (and you have a pretty minimal understanding of OO as far as I can make out, never mind enterprise patterns) then how would you ever know?
I am not ignorant of alternative solutions, it is just that I have seen too many of them which do not 'smell' right and which ultimately end up in failed or second-rate projects. I have seen what happens when a team of so-called 'experts' tries to build a system using all the OO 'best practices' and design patterns they can think of, and the result was not pretty. They spent 3 man-years building an infrastructure that did not work and which caused the client to cancel a multi-million pound contract. I turned round and built an infrastructure which satisfied the client's requirements in 2 man-weeks. So forgive me if I tend to take the advice of 'experts' with a pinch of salt.
It's pretty unpleasant, although mercifully rare, to come across someone who claims to be right all the time on scanty knowledge alone. To find someone who is proud of that ignorance, and refuses to learn even when things are tediously explained to them or further reading pointed out, is astonishing.
I have never said that my method is the 'right' way any more than I have said that your method is the 'wrong' way. I have merely said that I have found a 'different' way. You have not commented on the fact that my way works, or how easy it is to create and amend transactions, just that it is 'different', and because it is not 'your' way then it must be the 'wrong' way.

If someone is capable of making a constructive suggestion on how I can improve my infrastructure, or make it more efficient, or improve it in some other way then I will be more than happy to listen. But all the while you restrict your comments to 'you must not do it *that* way, you must do it *this* way' I will not listen. If I did things your way then I would be no better than you, and I am not convinced that 'your' way is the 'best' way. In an attempt to find something which is 'better' I have to start with something which is 'different', and it is this difference which seems to upset you.

This also came from lastcraft:

I am vilified for daring to be different.
You haven't been vilified for your project at all. You have been rightly vilified for inflated claims and an insecure attitude.

I replied with this post:

I have never claimed anything that cannot be justified. I have developed an infrastructure that works, I have produced volumes of documentation, and I have made a sample available on my website which can either be run online or downloaded. I have never claimed that it can do anything which it does not actually do. I have never claimed that it is pure OO, just that it contains some components which are OO. I have never claimed that my design is the only 'true' design, nor have I claimed that my methods are the only 'true' methods. All that I have claimed is that they are 'different'. It is 'difference' which is the parent of innovation.

All your arguments seem to be based on the fact that my methods are different to yours, not that my results are any better (or worse) than yours. I am a pragmatist in that my primary concern is the result, and I will use or discard any methodology as I see fit in order to achieve that result. I will use or discard any set of arbitrary rules as I see fit in order to achieve that result. Where a definition is open to interpretation I will use whatever interpretation I see fit in order to achieve that result.

You, on the other hand, fit the classic description of a dogmatist - you think that following an arbitrary set of rules to the letter and without question is more important than obtaining the best result.

You also think that your interpretation of those rules is the only true interpretation, therefore all other interpretations are automatically invalid.

I do not have an insecure attitude. I am secure in the knowledge that my infrastructures (note the plural as I have built 3 different infrastructures in 3 different languages) have allowed a higher level of programmer productivity than every other infrastructure I have encountered in the past 20 years.

Version0-00e came up with this post:

Countless number of people on this forum have explained in any number of ways of how and why you could (and probably should) implement a better design and development process, for example your class(es).

All you have done is explain that I should change both my design and implementation so that it is more like your design and your implementation. I am not going to change anything until I am convinced that there will be tangible benefits. Unfortunately some of your suggestions - such as using more and smaller classes - I have already encountered in the past with disastrous results. If an idea has such a detrimental effect on programmer productivity that it causes the entire project to be cancelled then you should appreciate my extreme reluctance in wishing to repeat that experience.

11. Your code smells and needs to be refactored

Brenden Vickery came up with this little gem:

I am not confusing the issue with subclasses. I am merely saying that I have a single User class, not a User and a UserMapper class.
By "encapsulating" everything in a single "User" class you have completely distorted the DAO pattern.

How so? All the information about a User is encapsulated with the User class (which exists in the business layer). None of it is held within the DAO (which exists within the data access layer). A User object does not communicate directly with the database, instead it communicates with the DAO which communicates with the database. How could this possibly be a distortion?

That is because I believe in the KISS principle. I do not like to make anything more complicated than it needs to be.
Here is your interface for a Person in your example application. This is not code written by someone who believes in "Keeping it Simple Stupid" nor does it take into account massive amounts of Object Oriented Principles.

There is some serious refactoring to do.

(He then listed out the 9 functions within my PERSON class and 86 functions within my abstract table class).

Take a look at the common functions. Everything to do with meta-data about an entity, put in some sort of TableMetaData object. Everything to do with insert/update validation put in a Validation object. Pass the validation object the TableMetaData object and the data to be validated. Everything to do with querying the database(setOrderBy(), setWhere() etc..) put in a query object and pass that object to the DAO.

This does not smell right to me. In the first place splitting information about an object across multiple classes breaks encapsulation. In the second place it requires more code as you now need this collection of classes to communicate with one another. This sounds like added complexity which would make the programmer's job more difficult. I have actually worked on an implementation of the N-tier architecture where the system architects chose to implement 10 tiers along the lines you suggest, and it was a total disaster. I refactored down to just 3 tiers and made it work. That is why I have a business layer object and a data access object. I do not need a separate TableMetaData object or a Query object. So my experience tells me this:

You will forgive me if I choose to trust my personal experience more than your opinion.

Having 100 or so functions in a class is a serious bad code smell. Doing these things and refactoring will drop the amount of code in your architecture by a ton.

To which I replied:

Really? I've never seen anything written in the principles of OOP that says a class must not have more than 'n' methods or 'n' properties, so as far as I am concerned that rule does not exist and so I choose to ignore it. It is an artificial rule designed by someone who likes to add complexity to the problem.

The definition of encapsulation is quite clear - all the properties and methods for an object must go into a single class.

And again here:

Surely it will not reduce the amount of code, but simply move it from one large class to a series of smaller classes. But then I will have to have extra code to instantiate and communicate with these extra classes, so the end result will not be LESS code but MORE code, and more complex code at that. I think I shall ignore your advice and stick to the KISS principle.
You are arguing just because I dare to disagree with you. Just because I choose to read from a different design 'bible' you seem to think I am a heretic. I do not have DataAccessor or DataMapper classes for the simple reason that I do not follow that particular design methodology. I use the 3 tier architecture in which I have components in the presentation layer, the business layer, and the data access layer.
You are definitely on the right track with everything in your architecture but there are blatant problems that could be addressed.

The 3 tiers in your architecture are mixed up as far as I can see.

Your Presentation is tied to your database through column names, and form names. You couldn't change your database without changing your presentation. You can't change your presentation without changing your database. I find the way you have done this to be extremely difficult to use.

No, my presentation layer is NOT tied to the database.

You have actual sql in your controller that is passed into your "DAO". This means you can't change your database without changing your controller.

The purpose of a data access layer is to "encapsulate" communication with the data source. Again, having sql in your controller screws this up.

I do not execute any sql statements in my controller, so I do not have any data access logic in my controller. However, I may have fragments of sql in data variables, but this is data, and data is not logic. It may come as a shock to you, but I am allowed to pass data variables between the different layers, so this does not break encapsulation. If you bothered to read the principles of OOP from a competent source you would realise that "encapsulation" deals with methods and properties, not communication. In my framework various pieces of data are passed to the DAO, and it is up to the DAO to assemble these pieces into a complete sql statement before it is executed. As I have a different DAO for each database engine each DAO can assemble these fragments in a manner which is appropriate for that particular database engine.
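
The idea of passing fragments as data and letting each DAO assemble them can be sketched as below. This is an illustration under stated assumptions, not the framework's real code: the class names BaseDao, MysqlDao and PostgresDao, the buildSelect() method and the keys of the fragment array are all hypothetical. The two subclasses differ only in how they express row limits, which is one well-known difference between MySQL and PostgreSQL syntax.

```php
<?php
// Hypothetical DAOs: each receives the same fragments and assembles its own dialect.
abstract class BaseDao
{
    // Assemble a complete SELECT statement from the supplied fragments.
    // The fragments are concatenated as-is, so a real implementation would
    // also need to guard against SQL injection.
    public function buildSelect(array $parts): string
    {
        $sql = 'SELECT ' . ($parts['select'] ?? '*') . ' FROM ' . $parts['from'];
        if (!empty($parts['where'])) {
            $sql .= ' WHERE ' . $parts['where'];
        }
        if (!empty($parts['orderby'])) {
            $sql .= ' ORDER BY ' . $parts['orderby'];
        }
        return $sql . $this->limitClause((int) ($parts['limit'] ?? 0), (int) ($parts['offset'] ?? 0));
    }

    // Each database engine expresses pagination in its own way.
    abstract protected function limitClause(int $limit, int $offset): string;
}

class MysqlDao extends BaseDao
{
    protected function limitClause(int $limit, int $offset): string
    {
        return $limit > 0 ? " LIMIT $offset, $limit" : '';
    }
}

class PostgresDao extends BaseDao
{
    protected function limitClause(int $limit, int $offset): string
    {
        return $limit > 0 ? " LIMIT $limit OFFSET $offset" : '';
    }
}
```

The caller supplies the same array of fragments whichever DAO is in use, and no statement exists until the DAO builds it.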

This post came from AWilliams:

Really? I've never seen anything written in the principles of OOP that says a class must not have more than 'n' methods or 'n' properties
Quite right, there is no written rule as such - it's completely subjective. But when a class reaches the size of the one 'Brenden Vickery' posted, I begin to think something is going horribly wrong. I'm only learning, myself, so I won't pretend to be any sort of authority on OOP, but perhaps this link will be of some use: http://c2.com/cgi/wiki?CodeSmell

Having read one of your earlier posts, I also feel you've misunderstood what the purpose of a single User class is. A User object should represent one user, no more, no less, and should encapsulate the properties and responsibilities of the real-life object. That is to say, I (in real-life) don't have the responsibility of storing information about myself in a database - as such, a User object should be completely unaware of any database interaction. This is the One Responsibility Rule. Any code which stores or retrieves information about the user should ideally be held in another class (described as a Mapper in this thread) because that is a completely separate responsibility.

User <-> UserMapper ( <-> DAO ) <-> Database

As I said, I am myself only learning, so correct me if I've misunderstood what I've read elsewhere.

The only difference between your structure and mine is that I do not have a separate UserMapper class (remember that its usage is optional, not obligatory) as the relevant information is defined within my User class as meta-data. In all other respects it is the same as whenever the User object requires any database access it sends a request to the DAO. How the DAO deals with the request is completely hidden from the User object.

I disagree that my User object should be limited to a single user. There is no such rule, therefore I don't have to obey it. Besides, have you not come across Martin Fowler's Table Module which is allowed to deal with multiple database rows?
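
For readers unfamiliar with the idea, Table Module uses a single object to handle all the rows of a table rather than one object per row. A minimal sketch follows; the UserTable class and its methods are hypothetical names chosen for illustration, and the rows are supplied as a plain array instead of coming from a database.

```php
<?php
// Hypothetical Table Module style class: one object, many rows.
class UserTable
{
    private array $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    // Return every row whose column holds the given value.
    public function findBy(string $column, $value): array
    {
        return array_values(array_filter(
            $this->rows,
            fn(array $row) => ($row[$column] ?? null) === $value
        ));
    }

    // Number of rows currently held by this object.
    public function count(): int
    {
        return count($this->rows);
    }
}

$users = new UserTable([
    ['user_id' => 'u1', 'role' => 'admin'],
    ['user_id' => 'u2', 'role' => 'guest'],
    ['user_id' => 'u3', 'role' => 'guest'],
]);
```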

Similarly if any User information needs to be presented to the user, such as in an HTML document, a PDF document, or whatever, then that information is extracted from the User object and given to the presentation layer component which deals with HTML, PDF or whatever. How the presentation layer component deals with the request is completely hidden from the User object. This is the separation of responsibilities that is defined for the 3-Tier architecture, and my infrastructure demonstrates that degree of separation.

This post came from Version0-00e:

... the Presentation layer shouldn't need to know of what fields to use from the database. Looking last night on Tony's site, I saw in XML he passes over the database field names (for whatever reason).

This isn't real separation of concerns surely? A XSL stylesheet doesn't need to know this, all that it's interested in is getting the data from the XML and parsing it dependent on a given template, nothing more.

I don't know how you write your programs, but in the last 20+ years when I have been building screens I don't just throw random bits of data at the screen, each piece of data has an id (name) as well as a value. Unless you want to make extra work for yourself it is extremely common practice to give each field on the screen the same name as is used in the database. This tends to avoid a great deal of confusion, and completely does away with the need for any data mappers.

When it comes to these modern inventions called XML and XSL you may be surprised to hear that the information within an XML file is comprised of elements where each element has a name and a value. In order to transform XML into HTML the XSL stylesheet has to be told which element to put where, and what does it use? Why, the element names of course.

Every layer within the system - the presentation layer, the business layer and the data access layer - has to know the identity of the piece of data it is dealing with, so why on earth are you complaining that I am using the same names in all 3 layers? Just what is the problem?

Version0-00e replied with this post:

What on earth is wrong with that?
Nothing as such some may think but it does mean having to change the XML file to suit the database schema

I can change the database schema without changing any code which constructs the XML file for the simple reason that I have a single piece of generic code which writes the contents of an associative array into the current XML document. This generic code does not contain any hard-coded field names as it uses whatever names are supplied in the associative array. This array is either provided by the DAO as a result of executing an sql SELECT, or provided by the user via an HTTP GET/POST request. Your criticism is therefore without any basis in fact.
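
A generic routine of that kind might look something like the following sketch, which uses PHP's DOM extension. The function name addRowToXml() and the element names are hypothetical; the point is only that the element names are taken from the array keys.

```php
<?php
// Hypothetical helper: append one row of name => value pairs to an XML document.
function addRowToXml(DOMDocument $doc, DOMElement $parent, string $rowName, array $row): DOMElement
{
    $rowNode = $doc->createElement($rowName);
    $parent->appendChild($rowNode);
    foreach ($row as $name => $value) {
        // Element names come from the array keys; nothing is hard-coded here.
        $field = $doc->createElement($name);
        $field->appendChild($doc->createTextNode((string) $value));
        $rowNode->appendChild($field);
    }
    return $rowNode;
}

$doc  = new DOMDocument('1.0', 'UTF-8');
$root = $doc->createElement('root');
$doc->appendChild($root);

// The array could equally come from a SELECT result or from GET/POST data.
addRowToXml($doc, $root, 'person', ['person_id' => 'FB', 'first_name' => 'Fred']);
$xml = $doc->saveXML($root);
```

If a column is added to the array, a matching element appears in the output without any change to addRowToXml().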

I don't know where you get these ideas from, but the order in which data is written to the XML file has got nothing to do with the order in which it is defined within the database schema. The order is irrelevant as every piece of data is identified by name, not its sequence number. Within the XSL stylesheet each element is extracted by name, so the order in which the elements appear within the XML file is totally irrelevant. This means that I can change the ordering of fields in any screen without changing the order in the database, and vice versa. The one is most definitely NOT inextricably tied to the other.
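
The same point can be shown without XSL. In the sketch below a plain DOM lookup stands in for the stylesheet, and the display order is supplied separately from the data. The renderFields() function and the element names are hypothetical.

```php
<?php
// Hypothetical screen definition: the display order lives here, not in the data.
function renderFields(string $xml, array $displayOrder): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $out = [];
    foreach ($displayOrder as $name) {
        // Each value is located by element name, so its position in the XML is irrelevant.
        $nodes = $doc->getElementsByTagName($name);
        $out[] = $name . '=' . ($nodes->length > 0 ? $nodes->item(0)->textContent : '');
    }
    return $out;
}

// The same data written in two different orders.
$a = '<person><first_name>Fred</first_name><last_name>Bloggs</last_name></person>';
$b = '<person><last_name>Bloggs</last_name><first_name>Fred</first_name></person>';
```

Both documents produce the same result for a given display order, and changing the display order requires no change to either document.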

The stylesheet in question shouldn't need to know what the data is or what format it is, nor what order to display it as, it just transforms the data nothing more, yes?

But surely the stylesheet must know in which order the elements are to be displayed? What type of control to use? What size of field to use? It must have some knowledge of what it is supposed to do, surely?

Selkirk came up with these observations:

Your code has helpful comments:
// initialise session 
initSession(); 

Is that any worse than these which I found in the creole package (a component of propel)?

    /**
     * Begins a transaction (if supported).
     *
     */
    public function begin();
    
    /**
     * Commits statements in a transaction.
     *
     */
    public function commit();
    
    /**
     * Rollback changes in a transaction.
     *
     */
    public function rollback();
as well as:
} // if  

It may come as a surprise to you, but I am used to working in languages where each control structure is terminated with a corresponding ENDIF, ENDWHILE, ENDFOR, ENDwhatever statement, so I do not like seeing an anonymous closing brace on its own without some sort of indication as to what control structure is actually being closed. Do you have a problem with that?

This post came from Brenden Vickery:

Here's an excerpt of std.table.class.php. There's a lot of comments but I can't quite figure this code out. Could you explain this? It might be another code smell but I haven't read any OO tutorials that say code like this is bad practice so I'll just ignore my instincts for now and chalk it up to another arbitrary rule and ignore any problems I think are here.
    function _cm_commonValidation ($fieldarray, $originaldata) 
    // perform validation that is common to INSERT and UPDATE. 
    { 
        return $fieldarray; 
         
    } // _cm_commonValidation 
     
    // **************************************************************************** 
    function _cm_formatData ($fieldarray) 
    // perform custom formatting before values are shown to the user. 
    { 
        return $fieldarray; 
         
    } // _cm_formatData 
     
    // **************************************************************************** 

... contents of 20 similar methods not repeated here

For your information those methods are defined within the superclass but do not actually do anything. They are there as empty stubs waiting to be overridden by custom code in any subclass. They are, in effect, abstract methods. For an idea how they are designed to work take a look at some UML diagrams which I have prepared.
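
The general technique, often described as the Template Method pattern, can be sketched as follows. This is a simplified illustration rather than the contents of std.table.class.php: the DefaultTable and Person classes and the insertRecord() method are hypothetical, and the database call is left out.

```php
<?php
// Hypothetical superclass: the public method fixes the sequence of steps,
// and the _cm_ methods are empty hooks that a subclass may override.
class DefaultTable
{
    public function insertRecord(array $fieldarray): array
    {
        $fieldarray = $this->_cm_commonValidation($fieldarray, []);
        // A real implementation would pass the data to the DAO at this point.
        return $this->_cm_formatData($fieldarray);
    }

    protected function _cm_commonValidation(array $fieldarray, array $originaldata): array
    {
        return $fieldarray; // default behaviour: do nothing
    }

    protected function _cm_formatData(array $fieldarray): array
    {
        return $fieldarray; // default behaviour: do nothing
    }
}

// A subclass overrides only the hooks it needs.
class Person extends DefaultTable
{
    protected function _cm_formatData(array $fieldarray): array
    {
        $fieldarray['last_name'] = strtoupper($fieldarray['last_name'] ?? '');
        return $fieldarray;
    }
}
```

The superclass runs unchanged whether or not a hook has been overridden, which is why the stubs can be empty.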

And here is an interesting opinion from lastcraft:

It's the biggest pile of horse manure I have had the misfortune to come across for some time. To describe 200+K of code as a KISS way of displaying a few tables is jaw dropping. Configuration of each class is by uncommenting in the middle of a class file would you believe. Definitely a write once and I'll never be out of a job piece of "software". I felt I had to go and wash my hands. And also wash the underside of my chin as it spent most of the time resting on my desk. POOP doesn't begin to describe it. It makes PHPNuke look exemplary. I cannot imagine the shame of leaving a mess like this behind for someone to clear up, never mind calling it an "application". To make a small sample application, designed to illustrate the system, into such an entangled mess takes an extraordinary comedy talent.

No, please. Don't hold back. Tell me what you REALLY think.

I have taken a look at some code which you and others in this forum have put forward as a perfect example of how it should be done (take a look here) and guess what! I am just as unimpressed with that code as you are with mine.

12. Your code is not OO, it is procedural

This remark came from Brenden Vickery:

Your code is not OO just because it demonstrates the use of encapsulation, inheritance and polymorphism. Almost your entire architecture uses procedural functions (which isn't always a bad thing) and require_once style forwarding to abstracted "Transaction Controllers" that are procedural if/else scripts.

This remark also came from Brenden Vickery:

Putting "class Default_Table {" before a list of procedural functions and "}" after the list does not make the code Object Oriented.

Here is my reply:

I beg to differ. If I write code that uses the OO functions within PHP, and that code follows the basic principles of OOP and exhibits the properties of encapsulation, inheritance and polymorphism, then by its very nature that code is OO.

That may not be the proper way according to your rules, but it works! In my humble opinion it is far better to have something which breaks some arbitrary set of rules but which works than it is to have something which obeys those rules but which fails to work.

This post came from DougBTX:

If the distinction between code that simply has classes and code that is object orientated is arbitrary to you, then you might want to take a second look at it.

If you put pure procedural code into an object, it is still procedural code. It can't be orientated around objects if it is just procedural code modularized using classes.

I replied with this:

OOP is nothing more than creating classes which have methods and properties. By creating a different class for each object you demonstrate encapsulation. By sharing common code through subclassing you demonstrate inheritance. By having common method names on different classes you demonstrate polymorphism. My code contains all those features, therefore it is OO. It may not be your flavour of OO, but it is OO all the same.
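
A small sketch of the three features named above may help. The Table, Customer and Product classes and the describeAll() function are hypothetical names used only for illustration.

```php
<?php
// Encapsulation: the data and the methods that act on it live in one class.
abstract class Table
{
    protected array $rows = [];

    public function add(array $row): void
    {
        $this->rows[] = $row;
    }

    // Inheritance: every subclass shares this code without repeating it.
    public function total(): int
    {
        return count($this->rows);
    }

    abstract public function describe(): string;
}

class Customer extends Table
{
    public function describe(): string
    {
        return 'customers: ' . $this->total();
    }
}

class Product extends Table
{
    public function describe(): string
    {
        return 'products: ' . $this->total();
    }
}

// Polymorphism: the same method name is called on objects of different classes.
function describeAll(array $tables): array
{
    return array_map(fn(Table $t) => $t->describe(), $tables);
}
```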

If not there are plenty of OO tutorials on the web which are also wrong. I have seen several which show how to take some procedural code and do it 'the OO way', and guess what? The code looks more or less the same, but simply enclosed in "class whatever{}".

At long last a supporting view from Captain Proton:

If the distinction between code that simply has classes and code that is object orientated is arbitrary to you, then you might want to take a second look at it.
Ok, you tell me: what exactly is the difference? By definition there is no difference: object oriented code is 'code that has classes' to put it loosely. I know what you're talking about though, in fact I have the same critique on a lot of code I see, but it is a subjective matter. I want to see an objective definition of OO that says I cannot just stick a bunch of functions into a class because it is not object oriented code any more.

Secondly, give me a definition of object oriented programming which requires the code to be more than just consisting of classes. Not your definition, but *the* definition (rhetorical question: is there any book, internet site, or whatever that gives such a good definition of what OOP is, instead of just some person's interpretation of what it should be?).

If you put pure procedural code into an object, it is still procedural code.

By definition every function is procedural. Object methods are not any different from regular functions (*procedures*) with a hidden $this parameter. So, by what you are saying, if I stuff *any* function into a class, it is no longer object oriented? You might want to think about exactly what it is you are saying before you make such a statement.

Here is one definition [of encapsulation]...

One definition? So there is more than one definition?? So encapsulation can be something totally different, depending on what definition you decide to follow??? Ohh what a great concept encapsulation must be if it can mean anything to anyone!....

Precision of concepts is important, people! Please try to realise this! The very reason we are having this heated discussion is because 'our' whole foundation is imprecise, vague and ambiguous!! OOP has no clearly defined concepts and therefore anybody can and is free to interpret the interpretations of other people in a way that most suits their needs. Is that a bad thing, freedom of interpretation? Yes! In an exact science like computer science, which is what PHP and OOP are founded on (right?), precision is important!

Also, consider this (note that I said 'consider', which means 'think about it' and is not the same as 'I want you to accept this'): if OOP leaves open so many questions and has so many subjective sides to it, perhaps the model is just a bad model? It may have good sides (believe me, it does) but there are bad sides to it - which I hope I have shed some light on in this post..

The M in MVC stands for "model". If they meant "database" they would have called it DVC.

The precision-of-definition argument is valid too, here. What is the definition of model? What is and what is not a model? Again this is a subjective thing. And subjective means that it is ambiguous and therefore the very cause of the confusion that lies beneath this discussion!

My code contains all those features, therefore it is OO. It may not be your flavour of OO, but it is OO all the same.

That's what I am trying to explain in this post: that because OOP has no objective definition, it is ambiguous and that will lead to several 'flavours' of it. Lastcraft's flavour, which is perhaps the general consensual flavour on this forum, may not be the same as Tony's, but Tony's code is no less OOP. Note that I am in no way saying that either one is better or they are equally good.

If not there are plenty of OO tutorials on the web which are also wrong.

Another critique on the subjective nature of OOP: each of those tutorials presents the interpretation of vague concepts by some author, a possible 'flavour' of OOP if you will. This is the cause of the confusion again.

Encapsulation is a language construct that facilitates the bundling of data with the methods operating on that data.

Great! "Encapsulation is a language construct", so that means encapsulation is something like if, else, foreach and while, right? Well since there is no such thing as "encapsulate { $var }" in any language, I assume the language construct intended here is the "class { }" construct, right? So according to this definition that means that *every* class is encapsulated! Do you see what a lousy definition this is?! The very first sentence of the so-called 'definition' is just plain wrong!

But of course if you view 'language construct' as a broader concept, you can see just exactly what you want in this definition of encapsulation: ambiguity -> vagueness -> confusion!

I could not have put it better myself. Well said, Captain Proton! (the cheque's in the post)

To sum up then:

That's it. End of story. Period. No 'ifs', 'ands' or 'buts'.

You cannot look at procedural code and say that's not proper procedural because the style is wrong, the implementation is wrong. I programmed in COBOL, a procedural language if ever there was one, for 16 years, which meant that every piece of code I wrote was procedural. There were many different ways of utilising the language features, of structuring the code to fit the problem, but no matter which 'style' was used the result was still procedural.

Likewise if I split my code into classes and objects it is, for that simple fact alone, object-oriented. The fact that my 'style' is different from yours is totally irrelevant. The fact that I choose to make classes of different 'things' to you is totally irrelevant. The fact that I split my application logic over a different number of classes to you is totally irrelevant.

Your entire argument is based on a false assumption and is therefore irrelevant and without merit.

13. Your system is not 3-Tier

lastcraft came up with this little gem:

This is what the 3-Tier architecture is all about, and I have implemented it correctly.
Not if there is a connection between your schema and your presentation you haven't. That is not a 3-tier system and to describe it as such is to commit malpractice.

There is *NO* connection between my schema and my presentation layer. The data access (schema) layer can *ONLY* be called from the business layer, and the business layer can *ONLY* be called from the presentation layer. The data access layer *CANNOT* be called from the presentation layer, therefore there is no connection between the two. The fact that I use the same item names in my schema, my data access layer, my business layer and my presentation layer is *NOT* a crime, it is common practice. Only an idiot would deliberately use different names as this would require extra components in the form of data mappers. I use consistent data names, which means that I do not need data mappers.

And later on with this:

The definition of tiering is that each tier can only see the one immediately below.

By "see" I assume you mean "call", which is correct. A tier can only call (send a request to) the tier immediately below it. All it can do with the tier immediately above it is return a response to a request. This is exactly what I have implemented (see Figure 1): the data access layer can *ONLY* be called from the business layer, and the business layer can *ONLY* be called from the presentation layer. So explain to me in words of one syllable exactly how this is wrong!

Figure 1 - Requests and Responses in the 3 Tier Architecture

The full architecture is sometimes called 4-tier (Presentation, Application, Domain, Infrastructure), but there are slightly different interpretations. When Application and Domain are combined, that is the traditional 3-tier. I find the 4-tier model a better diagnosis tool, although tiering (layering) is a pattern and as such you adjust it to taste.

The 3-tier model as I know it is presentation logic -> business logic -> data access logic. I do not see any benefit in breaking it up any further (indeed I have witnessed disastrous results when it was) so I choose to stick with this interpretation. I have used it successfully in two different languages, so I know its benefits over 2-tier and 1-tier systems.

Anyway, if changing the DB schema causes a change to the view you obviously don't have tiering. If changing a view (without adding a new application feature) causes anything at all to change then you don't have tiering. If changing a domain interface causes the schema to be edited even though the relationships are essentially correct, then you don't have tiering.

Certain changes to an application WILL require changes in more than one tier (layer):

Except that things aren't that simple.

In practice we take some shortcuts if they are expedient. An example is making the table names and class names synonymous (called an isomorphic mapping). This can make the architecture easier to understand on the ground and can save mapping code. If we are the only application or domain library using the DB then there is no problem with this. If a problem does arise, then you will have to put in the mapping code.

In that case I hold up my hand and admit to taking shortcuts for the sake of expediency. I also plead guilty to this isomorphic mapping thingy as to do anything else would be just plain stoooopid (in my humble opinion).

Yes you have to make it logically possible for a new piece of information to be displayed. Tiering allows us to choose the "how" of that process with complete freedom. You don't have to add a corresponding database field.

You do if it is not a derived or calculated field. If it has to be input by the user then you have to store it somewhere otherwise the user will not be able to store or retrieve any values.

In this post lastcraft made a very profound statement:

You have a client/server app., not a three tier one.

Which generated this response:

By making such a statement you have just demonstrated a complete lack of understanding of the purpose of the 3-tier architecture. Let me enlighten you: The 3-tier architecture has its application logic split across the following tiers (aka layers):

Having a separate data access layer means that you can switch from one DBMS engine to another without changing any logic in the other layers.
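As a rough illustration of that point, here is a minimal PHP sketch (assuming PHP 7 or later). Every name in it (`DataAccess`, `MysqlAccess`, `PgsqlAccess`, `ProductBusiness`, the `product` table and its `status` column) is invented for the example and is not taken from any real codebase. The two drivers differ only in how they quote identifiers (backticks for MySQL, double quotes for PostgreSQL), and they only build the SQL string rather than run it, so the sketch works without a database.

```php
<?php
// Hypothetical contract that the business layer relies on.
interface DataAccess
{
    public function buildSelect(string $table, string $where): string;
}

// Hypothetical MySQL driver: identifiers are quoted with backticks.
class MysqlAccess implements DataAccess
{
    public function buildSelect(string $table, string $where): string
    {
        return "SELECT * FROM `$table` WHERE $where";
    }
}

// Hypothetical PostgreSQL driver: identifiers are quoted with double quotes.
class PgsqlAccess implements DataAccess
{
    public function buildSelect(string $table, string $where): string
    {
        return "SELECT * FROM \"$table\" WHERE $where";
    }
}

// Business layer: knows the rule ("active products") but not the DBMS.
class ProductBusiness
{
    private $dao;

    public function __construct(DataAccess $dao)
    {
        $this->dao = $dao;
    }

    public function findActiveQuery(): string
    {
        return $this->dao->buildSelect('product', "status = 'A'");
    }
}

// The same business class is used unchanged with either driver.
foreach ([new MysqlAccess(), new PgsqlAccess()] as $dao) {
    $business = new ProductBusiness($dao);
    echo $business->findActiveQuery(), "\n";
}
```

The only line that changes when the DBMS changes is the one that chooses which driver object to create.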

Having a separate presentation layer means that you can change your user interface without having to make any changes in your business or data access logic.

The 3-tier architecture was promoted in the language I used prior to PHP as it made it possible to have an application with a client-server interface and to enable it for the web simply by adding on a new HTML interface which could share all the existing business and data access logic.

In case the significance of this has escaped you, it means that a 3-tier application has exactly the same structure whether it has a client-server interface or a web interface, or even if it has both at the same time. You do NOT have one version of the 3-tier architecture for client-server and another version for the web. It is the same architecture, but with different presentation layers. As my presentation layer produces HTML output and runs off a web server, it is most definitely a web application.

lastcraft responded with this post:

3 tier is not about dividing up code. You could do that just by placing different source files into different folders on your hard drive and claim it was "3 tier". 3 tier is about severely restricting visibility across those boundaries. If you fail to do that then you don't have a 3 tier architecture. There is no room for opinion here, you simply don't understand the definition if you've not done this.

Excuse me, but 3-tier most definitely IS about dividing up code. Presentation logic goes into the presentation layer, business logic goes into the business layer, and data access logic goes into the data access layer. It is as simple as that.

And added even more with this post:

Separation is not the only requirement, but also limiting the visibility to the next lower layer. You've broken that and it has been painstakingly explained to you at least four times now. The degree of breakage is not that great, it's a fair go for a first attempt, but that breakage is still there. I suspect that you could probably clean it up pretty easily. If you tried.

To be classed as 3 tier the application is required to have the following:

Communication between these layers is limited to the following:

In other words the requests must always be in the direction front-to-middle-to-back while the responses must always be back-to-middle-to-front.
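The direction of travel described above can be sketched in a few lines of PHP (assuming PHP 7 or later). This is only an illustration under stated assumptions: the class names `PersonDataAccess`, `PersonBusiness` and `PersonPage`, the field names and the "full name" rule are all hypothetical. Each object holds a reference only to the layer immediately below it, so a request can only go front-to-middle-to-back and the response can only come back the same way.

```php
<?php
// Back: data access layer (an in-memory array stands in for a real query).
class PersonDataAccess
{
    private $rows = [
        1 => ['person_id' => 1, 'first_name' => 'Fred', 'last_name' => 'Bloggs'],
    ];

    public function getRow(int $id): array
    {
        return $this->rows[$id] ?? [];   // empty array when nothing is found
    }
}

// Middle: business layer, which can call only the data access layer.
class PersonBusiness
{
    private $dao;

    public function __construct(PersonDataAccess $dao)
    {
        $this->dao = $dao;
    }

    public function getPerson(int $id): array
    {
        $row = $this->dao->getRow($id);              // request goes down
        if ($row === []) {
            return [];
        }
        // Invented business rule: add a derived field.
        $row['full_name'] = $row['first_name'] . ' ' . $row['last_name'];
        return $row;                                 // response goes up
    }
}

// Front: presentation layer, which can call only the business layer.
class PersonPage
{
    private $business;

    public function __construct(PersonBusiness $business)
    {
        $this->business = $business;
    }

    public function render(int $id): string
    {
        $person = $this->business->getPerson($id);
        if ($person === []) {
            return '<p>Not found</p>';
        }
        return '<p>' . htmlspecialchars($person['full_name']) . '</p>';
    }
}

$page = new PersonPage(new PersonBusiness(new PersonDataAccess()));
echo $page->render(1), "\n";
```

Note that `PersonPage` has no way of reaching `PersonDataAccess` directly, and that the field names pass through all three layers unchanged, so no mapping code is needed.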

That is exactly what I have implemented, so how can it be wrong?

How have I broken the concept of "limiting visibility"? Where is this concept defined?

DougBTX decided to join the argument with this post:

You may implement the 3-tier architecture in a different way, but as long as it fulfills its purpose it would be a perfectly valid implementation, just as mine is a perfectly valid implementation.
But an implementation of what? Just because it "works" does not mean that it is an implementation of a 3-tier architecture.

Let me put this as simply as I can - if I separate my application so that the presentation logic, business logic and data access logic are in separate tiers (layers) with their own components then that is DEFINITELY, ABSOLUTELY and INCONTROVERTIBLY an implementation of the 3-tier architecture. That implementation may not be to your personal liking, but that is irrelevant.

Yet another piece of wisdom from lastcraft:

The 3-tier architecture was promoted in the language I used prior to PHP as it made it possible to have an application with a client-server interface and to enable it for the web simply by adding on a new HTML interface which could share all the existing business and data access logic.
If they had near identical navigation and screens, then you did not change the presentation layer, all you did was port details of the front end.

Being able to port once does not make it 3 tier (should be 4 tier these days by the way).

My reply was as follows:

I don't know where you get your ideas from, but changing the user interface from client-server to web most definitely IS a change in the presentation layer. One is stateful and uses a GUI interface, one is stateless and uses HTML and CSS as its interface. The two are totally different visually, and require totally different code to generate them. The fact that the two sets of screens are made to look as similar as possible is not only irrelevant, it is usually a user requirement as they do not want massive amounts of retraining in order to switch from one set of screens to the other.

Another important factor is that it is (or should be) possible to run both user interfaces at the same time using the same business and data layers.
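A minimal PHP sketch of that idea (assuming PHP 7 or later) follows. The names `OrderBusiness`, `renderHtml` and `renderText` and the sample data are hypothetical, and the plain-text renderer is only a stand-in for a second, non-web user interface. Both renderers consume exactly the same array from the same business object.

```php
<?php
// Shared business layer (the hard-coded row stands in for a data access call).
class OrderBusiness
{
    public function getOrder(int $id): array
    {
        return ['order_id' => $id, 'customer' => 'Smith & Co', 'total' => 125.5];
    }
}

// Presentation layer no. 1: HTML for a web browser.
function renderHtml(array $order): string
{
    return '<tr><td>' . $order['order_id'] . '</td><td>'
        . htmlspecialchars($order['customer']) . '</td><td>'
        . number_format($order['total'], 2) . '</td></tr>';
}

// Presentation layer no. 2: fixed-width text, standing in for another client.
function renderText(array $order): string
{
    return sprintf('%5d  %-12s %10.2f',
        $order['order_id'], $order['customer'], $order['total']);
}

$business = new OrderBusiness();
$order = $business->getOrder(7);

echo renderHtml($order), "\n";
echo renderText($order), "\n";
```

Only the HTML renderer needs to escape the ampersand. That is a presentation concern, so it stays out of the business layer.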

14. Here is an example of how it should be done

My critics were very quick to tell me that my infrastructure was total crap and in desperate need of refactoring, but none of them was willing to provide any concrete examples which could prove that their ideas would provide better results than mine. Note that I am not going to spend any of my time refactoring my code just to do it a 'different' way. Unless that effort produces tangible results it is wasted effort.

First, here is a description of how you can install and run my sample application:

Nothing else needs to be done in order to run all the forms which access the sample database. This contains seven tables in a mixture of one-to-many and many-to-many relationships. There is even a table which is related to itself.

Someone then suggested that I take a look at the propel package as a shining example of how things should be done, so I did.

Here is a description of how you can install and run propel:

Unlike my offering, which is a complete end-to-end solution, the propel package is nothing more than the data access layer, which means that two-thirds of the application is missing. You cannot actually run anything until you build the rest yourself, and there are no runnable samples to show you how it is supposed to be done.

Not only that, it will only run with PHP 5.

I did have a brief look at the documentation and some of the code, but I did not like what I saw, so I gave up.

Is this really the best you can do?


Conclusion

It is quite clear to me that object oriented programming is anything but simple. What started life with a few simple principles - that of encapsulation, inheritance and polymorphism - has quickly grown into a multi-headed beast which means something totally different to different people. The original definitions have been redefined, re-interpreted, expanded and mutated beyond all recognition. If you don't believe me then consider the following:

Almost every piece of terminology used in OO seems to mean different things to different people. Some of the arguments which caused this article were:

All the while these simple questions do not have simple answers which are universally accepted then it will be impossible to produce software which satisfies everybody. No matter how hard you try someone somewhere will always find a reason to denigrate your efforts. If you use method 'A' then to the followers of method 'A' you are a hero while to the followers of method 'B' you are a heretic. If you switch to method 'B' then the followers of method 'A' will call you a heretic.

So what is the poor programmer to do? If it is impossible to please every group of IT professionals, then which group should you aim to please? Answer: none of them. The only people you REALLY need to please are as follows:

  1. The paying customer.
  2. Yourself.
  3. The rest of the development team.

The first should be easy to justify. If he is not pleased with what you create he is unlikely to buy it. If he won't buy it then you will soon be out of a job.

The second is something which a lot of people fail to understand. If a developer is not pleased with what he has developed, if he cannot take pride in it, if he is unable to say 'I have done my best, I cannot do better', then it is liable to be a piece of second-class material which will come back to haunt him. I've been there, so I know what I'm talking about.

The third should be obvious. If the development team cannot get to grips with the development environment then they will not be able to do what they are supposed to do, which is to generate the components which satisfy the user's requirements. Anything which helps to boost programmer productivity is therefore a good thing. Anything which hinders programmer productivity is therefore a bad thing.

I draw your attention to the following post made by Selkirk:

Circa 1996, I was asked to analyze the development processes of two different development teams.

Team A's project had a half a million lines of code, 500 tables, and over a dozen programmers. Team B's project was roughly 1/6 the size.

Over the course of several months, management noticed that team A was roughly twice as productive as team B. One would think that the smaller team would be more productive.

I spent several months analyzing the code from both projects, working on both projects and interviewing programmers. Finally I did an exercise which led to an epiphany. I counted each line of code in both applications and assigned them to one of a half a dozen categories: Business logic, glue code, user interface code, database code, etc.

If one considers that, of these categories, only the business logic code had any real value to the company, it turned out that Team A was spending more time writing the code that added value, while team B was spending more time gluing things together.

In the 70s an AI researcher named Doug Lenat wrote a program called AM that could discover mathematical proofs. One of the arguments against this actually being an example of computer creativity is that AM used a very rich notational language in its problem domain (mathematics). Even generating random symbols in this notation made it hard not to come up with a proof of some kind.

Team A had a set of libraries which was suited to the task which they were performing. Team B had a set of much more powerful and much more general purpose libraries.

So, Team A was more productive because the vocabulary that their tools provided spoke "Their problem domain," while team B was always translating. In addition, team A had several patterns and conventions for doing common tasks, while Team B left things up to the individual programmers, so there was much more variation. (especially because their powerful library had so many different ways to do everything.)

This is where the design of the development infrastructure plays an extremely important part. If productivity is measured by the amount of high-value code (business logic) which is written, then the infrastructure should minimise the need to write any low-value code (glue code, interface code, database code, etc). If you care to examine my own infrastructure you should observe that the bulk of the programmer's valuable time is spent on writing code for the business objects. Apart from writing very small component scripts and screen structure scripts everything else is standard and can be used 'out of the box'.

It is also my opinion (and this is where I appear to be in a very small minority) that more time should be spent on writing code which provides practical solutions to the customer's problems and less time should be spent on writing code that satisfies an arbitrary set of academic criteria, especially when those academic criteria seem to undergo constant re-definition, re-invention, re-evaluation and re-interpretation.

What are the benefits of OO Programming?

Advocates of OOP often make brash claims as to why developers should switch from 'old-fashioned' procedural programming to 'new-fangled' object oriented programming. Among these claims are:

Yet there are some in the world of OOP who cannot achieve faster development than with other paradigms. As an example I draw your attention to this post by lastcraft in which he says:

I find that OO is best as a long term investment. This falls into my manager's bad news (which I have shamelessly stolen from others at various times) when changing to OO...
1) Will OO make writing my program easier? No.
2) Will OO make my program run faster? No.
3) Will OO make my program shorter? No.
4) Will OO make my program cheaper? No.
The good news is that the answers are yes when you come to rewrite it!

So in his 'expert opinion' using OO techniques to write an application will NOT give you cheaper software or shorter delivery times, and it is not expected to yield any benefit until you come to rewrite it at some far distant point in the future.

I think that such an attitude is indicative of a lack of ability in the programming department. I have produced four versions of the same application in the past years where each rewrite was to take advantage of a change in technology:

With each rewrite I first concentrated on rebuilding my basic infrastructure to work in the new environment, then I ported each of the transactions in my sample application. Without exception I found that I could develop individual components at a faster rate than I could in the previous implementation, so when someone says that it is not reasonable to expect faster development times using OOP I just have to question that person's ability as a programmer. And he has the nerve to tell me that my approach is wrong! Excuse me while I collapse in fits of laughter.

It is a sad fact of life that only a very small number of programmers have the ability to create an application infrastructure that truly supports Rapid Application Development (RAD). They simply assume that all they have to do is follow a set of rules and everything in the garden will be rosy. They fail to realise that it is possible for certain rules to actually have a detrimental effect on a programmer's productivity. I have spent a great deal of my career in experimenting with various techniques and methods, and I will only use those which have proved their worth in a live situation. I deal in practical, not theoretical solutions, so I will accept or reject any rule, technique or methodology as I see fit. I will judge each rule, method or technique on its ability to help or hinder a programmer to be as productive as possible, and I will reject those which fail to measure up to my high standards. If that means that I end up rejecting something which others consider to be a 'golden rule' or the 'gospel truth' then so be it.

Very few programmers appear to have the ability to make such value judgements, and are happy to let other people make their decisions for them. Yet if they follow second-rate rules without question, how can they possibly expect to be anything more than second-rate themselves? Unfortunately inferiority can only be recognised in comparison with something which is superior, and few of today's programmers have exposure to different methods with which such a comparison can be made. In order for something to be superior it has to be different in some way, yet too many people appear willing to reject a new idea simply because it IS different. 'It is different from what I have been taught,' they say, 'therefore it must be wrong'.

Some people accuse me of arrogance because I dare to question the wisdom of those who claim to be my 'elders and betters', but are not the same people guilty of arrogance for assuming that their way is 'the right way, the only way'? If I can produce better results by using a different method then why should I lower the quality of my work by following their inferior methods? After all, the 'R' in RAD stands for 'rapid' not 'retarded'.

Programmer Productivity takes Preference over Paradigm Purity

All this talk about whose programming style is best or worst, correct or incorrect, pure or impure, proper or improper, acceptable or unacceptable is largely irrelevant. The primary purpose behind software development is not to please other developers with how closely you can follow an arbitrary set of rules, or how many levels of abstraction you can invent, or how many different design patterns you can implement, or how many buzzwords you can use to describe it, but to produce something that pleases the paying customer. When a potential customer considers a particular software solution there are only three basic questions in his mind:

You should notice that neither 'programming style' nor 'paradigm purity' appear in that list (nor should they, IMHO).

Regardless of what type of application is being developed, in what language and for which platform the final product will always consist of at least two things:

You cannot simply get a team of programmers to generate components and expect them to fit together like magic. You must have some sort of framework or infrastructure in place to hold them all together. The human body is much the same: it has a collection of components (the organs), but without a proper framework (the skeleton) we would be nothing more than a mass of jelly flopping around on the ground.

In the past 20 years I have personally developed 3 different infrastructures in 3 different languages, and they all had the following attributes:

The primary purpose of these infrastructures has always been to provide the programming team with a development environment which enables them to produce effective components as quickly as possible. Speed of development has a direct impact on both the budget and the timescales, so the faster you can turn out components the lower the costs and the earlier the delivery date. When you work for a software house (as I have done for most of my career) and have to bid against the competition then lower costs and shorter timescales are considered to be so important that they are actively encouraged at every opportunity. Programmer productivity is more important than paradigm purity.

I have always developed in a style which produces the best balance of quality, productivity and maintainability, so anything which is found to speed up the development process gets the thumbs up while anything which obstructs it gets the thumbs down. This may mean that I occasionally choose to discard someone's favourite methodology or design strategy, but when the responsibilities for the project are mine the decisions are also mine.

I said earlier that the programming style should not be a factor in the choice of software solution. Others may disagree, so to demonstrate my point I invite you to consider the following example in which 3 different teams propose their own solution to the same customer's problem. Each team develops using its preferred style, its own definition of 'best practice'.

Assume that the application contains 100 components and that the charge rate is £10 per hour:

In case you think I've pulled those times out of thin air, let me assure you that I have personal experience of development environments in which those figures were applicable. In fact the worst one had development times that were 2 weeks per component, not 1. It has also been my experience that efforts to make the infrastructure more 'technically pure' have a negative effect on development times.

In the customer's eye the most important aspect of each bid, the 'bottom line', is as follows:

Team A: 100 hours at £10 per hour = £1,000.00
Team B: 100 days at £10 per hour x 7 = £7,000.00
Team C: 100 weeks at £10 per hour x 7 x 5 = £35,000.00
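The arithmetic behind those three totals can be checked with a few lines of PHP. This is only a sketch: it assumes that the 'x 7' in the figures means a 7-hour working day and the 'x 5' a 5-day working week, and that the three teams need one hour, one day and one week per component respectively. The variable names are invented for the example.

```php
<?php
$components  = 100;  // components in the application
$ratePerHour = 10;   // charge rate in pounds per hour
$hoursPerDay = 7;    // assumed meaning of 'x 7'
$daysPerWeek = 5;    // assumed meaning of 'x 5'

// Hours needed to build ONE component under each team's approach.
$hoursPerComponent = [
    'Team A' => 1,                            // one hour
    'Team B' => $hoursPerDay,                 // one day
    'Team C' => $hoursPerDay * $daysPerWeek,  // one week
];

$bids = [];
foreach ($hoursPerComponent as $team => $hours) {
    $bids[$team] = $components * $hours * $ratePerHour;
    printf("%s: £%s\n", $team, number_format($bids[$team], 2));
}
```

On these assumptions Team B's bid is 7 times that of Team A, and Team C's is 35 times.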

With my experience of dealing with customers I think it is fair to say that the last questions they are liable to ask would be:

Most customers simply do not care about how the software is written, about what it looks like 'under the hood'. Even if they did care they would have a very hard time convincing the board of directors why they should accept anything other than the lowest bid.

So in the real world you see that 'paradigm purity' does not have a high enough ranking in the customer's eyes, so you would be well advised to concentrate your efforts on those areas which impress the customer instead of those which impress no-one but the 'paradigm police'. There is a saying in the culinary world which goes: The proof of the pudding is in the eating, which means that the recipe and the ingredients are irrelevant if the end product tastes like crap (that is, does not meet the customer's expectations). It is the results that count, not the methodology. Programmer productivity takes preference over paradigm purity.

Here endeth the lesson. Don't applaud, just throw money.


© Tony Marston
25th November 2004

http://www.tonymarston.net
http://www.radicore.org
