Chapter 5: PHP and Web Distributed Data eXchange (WDDX)

"Beam me up, Scotty!"
The preceding chapters have focused primarily on parsing XML documents, with a strong emphasis on producing content for web browsers. Although accomplishing this is no mean feat (in fact, it's one of the most popular ways to use the XML/PHP combo), it's just the tip of the XML iceberg.

You'll remember from my opening remarks that XML provides constructs to encode any type of information in a standard, machine-readable format. This makes XML the ideal vehicle for information exchange over the web. All that's needed is an encoding format that is understandable to both sender and receiver and that can piggyback over standard Internet protocols (HTTP, SMTP, FTP, and so on).

That's where the Web Distributed Data eXchange (WDDX) comes in. WDDX provides a standard format for creating XML-based data structures designed for easy transmission across the Internet. These WDDX data structures are largely platform-independent, and they can be decoded and used by any application that understands the WDDX format.

Over the next few pages, I will be examining WDDX in greater detail, demonstrating how it can be combined with PHP to encode and exchange data across different systems and platforms. This chapter marks the transition from merely parsing XML data to actually using XML as the vehicle for other applications. In addition to detailed descriptions of how PHP can be used to create WDDX structures, I'll also be demonstrating some real-life applications of the technology to illustrate its usefulness and versatility.

WDDX

Invented by Allaire Corp. (makers of the HomeSite HTML editor and the ColdFusion application development environment), WDDX is ". . . an XML-based technology that enables the exchange of complex data between web programming languages . . ." [1] It was created in 1998 as an open standard designed specifically to simplify data exchange across different platforms, and it has quickly gained popularity with web developers for its elegance and ease of use.
WDDX works by converting language-specific data structures into their corresponding XML representations. These XML data structures are text-based, platform-independent entities and, as such, can be transmitted between different systems over standard protocols such as HTTP with minimal difficulty. Any WDDX-friendly application can read these WDDX packets and convert them back into their original form.

For example, a Python list could be encoded into WDDX and transmitted across HTTP to a PHP script, which could decode it into a PHP array. Or a PHP associative array could be translated into WDDX, sent to a Perl script, and decoded into a Perl hash for use within that script.

Perhaps an example would make this clearer. Consider Listing 5.1, a single-line PHP script that defines a variable and assigns it a value.

Listing 5.1 A Simple PHP Variable

<?php
$str = "amoeba";
?>

If this variable were to be converted (or, as the geeks say, "serialized" or "pickled") into its WDDX representation, it would look something like Listing 5.2.

Listing 5.2 A WDDX Representation of a PHP Variable

<wddxPacket version='1.0'>
  <header/>
  <data>
    <string>amoeba</string>
  </data>
</wddxPacket>

As you can see, this is regular XML markup, with both the variable type and its value embedded within it. The document element here is the <wddxPacket> element; it can be separated into distinct header and data areas. The header can hold a human-readable comment (it is empty in this example), whereas the data block contains an XML-encoded structure representing the data to be transmitted.

This WDDX representation can be decoded (or "deserialized" or "unpickled") and used by any application that understands WDDX. And so, by creating a standard framework for representing common data structures and by expressing this framework in XML, WDDX makes it possible to easily exchange information over the standard Internet backbone.
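To see that a WDDX packet really is just XML carrying a type and a value, here is a minimal sketch that reads the packet from Listing 5.2 with PHP's SimpleXML extension instead of the WDDX module. It assumes a PHP build with SimpleXML enabled; the variable names are illustrative only.

```php
<?php
// The packet from Listing 5.2, held as a plain string.
$packet = "<wddxPacket version='1.0'><header/>"
        . "<data><string>amoeba</string></data></wddxPacket>";

// Parse it as ordinary XML; no WDDX module is involved here.
$xml = simplexml_load_string($packet);

// The first child of <data> carries the payload: its element
// name is the WDDX type, and its text content is the value.
$payload = null;
foreach ($xml->data->children() as $child) {
    $payload = $child;
    break;
}

$type  = $payload->getName();
$value = (string) $payload;

echo "packet version: " . $xml['version'] . "\n";
echo "type: $type\n";
echo "value: $value\n";
```

Any language with an XML parser can pull the same two pieces of information out of the packet, which is what makes the format portable.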
WDDX consists of two parts: the WDDX specification that defines basic WDDX structures, and a set of WDDX components for different languages that handles the translation of language-specific data structures into platform-independent XML representations. As of this writing, the WDDX specification supports most commonly used data structures (see the "Not My Type" sidebar for a list), and WDDX components are available for PHP, Perl, ASP, Java, Python, JavaScript, and COM. This immediately makes the technology attractive to developers whose work involves exchanging bits and bytes in multiplatform environments. It's important to note that WDDX is not an "official" specification per se; rather, it's an open standard created and supported by one company. Despite this, it's fairly popular, primarily for use in client-server or server-server content publishing systems, or as a wrapper for data exchange between multiple programming languages. By allowing applications written in different languages to easily communicate with each other, it also opens the door to new B2B applications, particularly in the areas of streamlining business processes and transactions. For more information on WDDX, you should refer to the official web site at http://www.openwddx.org/, which contains the WDDX Document Type Definition (DTD), an SDK (if you need to implement WDDX on an unsupported platform), and usage examples.
PHP and WDDX

As previously stated, a WDDX module has been available for PHP since version 4.0 of the language. Created by Andrei Zmievski, this WDDX module includes standard serialization and deserialization functions to convert PHP variables and arrays into WDDX-compatible data structures.

If you're using a stock PHP binary, it's quite likely that you'll need to recompile PHP to add support for this library to your PHP build (detailed instructions for accomplishing this are available in Appendix A, "Recompiling PHP to Add XML Support").

Encoding Data with WDDX

PHP's WDDX module offers a number of different ways to encode data into WDDX. The following sections demonstrate this by using the wddx_serialize_value(), wddx_serialize_vars(), and wddx_add_vars() functions.
The wddx_serialize_value() Function

The simplest way to encode data into WDDX (and the one I will use most frequently in this chapter) is via the wddx_serialize_value() function, which is used to encode a single variable into WDDX. Listing 5.3 demonstrates its application.

Listing 5.3 Serializing a Single Variable with wddx_serialize_value()

<?php
$flavor = "strawberry";
print wddx_serialize_value($flavor);
?>

Listing 5.4 demonstrates the result.

Listing 5.4 A WDDX Packet Generated via wddx_serialize_value()

<wddxPacket version='1.0'>
  <header/>
  <data>
    <string>strawberry</string>
  </data>
</wddxPacket>
As Listings 5.5 and 5.6 demonstrate, this works with arrays, too.

Listing 5.5 Serializing a PHP Array with wddx_serialize_value()

<?php
$flavors = array("strawberry", "chocolate", "raspberry", "peach");
print wddx_serialize_value($flavors);
?>

Listing 5.6 A WDDX Packet Representing an Array

<wddxPacket version='1.0'>
  <header/>
  <data>
    <array length='4'>
      <string>strawberry</string>
      <string>chocolate</string>
      <string>raspberry</string>
      <string>peach</string>
    </array>
  </data>
</wddxPacket>

An optional second parameter to wddx_serialize_value() lets you add a human-readable comment to the resulting packet. Listing 5.7 is a variant of Listing 5.5 that demonstrates this, with the output shown in Listing 5.8.
Listing 5.7 Adding a Comment to a WDDX Packet

<?php
$flavors = array("strawberry", "chocolate", "raspberry", "peach");
print wddx_serialize_value($flavors, "A WDDX representation of my favorite icecream flavors");
?>

Listing 5.8 A WDDX Packet with a Human-Readable Comment in the Header

<wddxPacket version='1.0'>
  <header>
    <comment>A WDDX representation of my favorite icecream flavors</comment>
  </header>
  <data>
    <array length='4'>
      <string>strawberry</string>
      <string>chocolate</string>
      <string>raspberry</string>
      <string>peach</string>
    </array>
  </data>
</wddxPacket>

The wddx_serialize_vars() Function

The wddx_serialize_value() function cannot accept more than a single variable. However, it's also possible to serialize more than one variable at a time with the wddx_serialize_vars() function, which can accept multiple variables for serialization as function arguments. Listing 5.9 demonstrates how this works.

Listing 5.9 Serializing Multiple Values with wddx_serialize_vars()

<?php
$phrase = "The game's afoot";
$animals = array("parrot" => "Polly", "hippo" => "Hal", "dog" => "Rover", "squirrel" => "Sparky");
print wddx_serialize_vars("phrase", "animals");
?>

Note that wddx_serialize_vars() requires the names of the variables to be serialized as string arguments. Listing 5.10 displays the result of a wddx_serialize_vars() run.

Listing 5.10 A WDDX Packet Generated via wddx_serialize_vars()

<wddxPacket version='1.0'>
  <header/>
  <data>
    <struct>
      <var name='phrase'>
        <string>The game's afoot</string>
      </var>
      <var name='animals'>
        <struct>
          <var name='parrot'>
            <string>Polly</string>
          </var>
          <var name='hippo'>
            <string>Hal</string>
          </var>
          <var name='dog'>
            <string>Rover</string>
          </var>
          <var name='squirrel'>
            <string>Sparky</string>
          </var>
        </struct>
      </var>
    </struct>
  </data>
</wddxPacket>

It's interesting to note, also, that wddx_serialize_value() and wddx_serialize_vars() generate significantly different (though valid) WDDX packets.
Consider Listing 5.11, which creates a WDDX packet containing the same variable-value pair as Listing 5.3, and compare the resulting output in Listing 5.12 with that in Listing 5.4.

Listing 5.11 Serializing a Single Variable with wddx_serialize_vars()

<?php
$flavor = "strawberry";
print wddx_serialize_vars("flavor");
?>

Listing 5.12 A WDDX Packet Generated via wddx_serialize_vars()

<wddxPacket version='1.0'>
  <header/>
  <data>
    <struct>
      <var name='flavor'>
        <string>strawberry</string>
      </var>
    </struct>
  </data>
</wddxPacket>

The wddx_add_vars() Function

PHP also allows you to build a WDDX packet incrementally, adding variables to it as they become available, with the wddx_add_vars() function. Listing 5.13 demonstrates this approach, building a WDDX packet from the results of a form POST operation.

Listing 5.13 Building a WDDX Packet Incrementally with wddx_add_vars()

<?php
// create a packet handle
// the argument here is an optional comment
$wp = wddx_packet_start("A packet containing a list of form fields with values");

// iterate through POSTed fields
// add variables to packet
wddx_add_vars($wp, "HTTP_POST_VARS");

// end the packet
// you can now assign the generated packet to a variable
// and print it
wddx_packet_end($wp);
?>

This is a slightly more complicated technique than the ones described previously. Let's go through it step by step:
First, wddx_packet_start() creates a new packet and returns a handle to it:

$wp = wddx_packet_start("A packet containing a list of form fields with values");

This handle is used in all subsequent operations. Note that the wddx_packet_start() function can be passed an optional comment string, which is used to add a comment to the header of the generated packet.

Next, wddx_add_vars() adds variables to the packet:

wddx_add_vars($wp, "HTTP_POST_VARS");

This function works in much the same way as wddx_serialize_vars(): it accepts multiple variable names as arguments (although I've only used one here), serializes these variables into WDDX structures, and adds them to the packet. Note, however, that wddx_add_vars() requires, as its first argument, the handle representing the packet to which the data is to be added.

Finally, wddx_packet_end() closes the packet:

wddx_packet_end($wp);

Note that the wddx_packet_end() function returns the contents of the newly minted packet; this return value can be assigned to a variable and used in subsequent lines of the PHP script.

This approach comes in particularly handy if you're dealing with dynamically generated data, either from a database or elsewhere.

With your data now safely encoded into WDDX, let's look at how you can convert it back into usable PHP data structures.

Decoding Data with WDDX

Although there are five different functions available to encode data into WDDX, PHP has only a single function to perform the deserialization of WDDX packets. This function is named wddx_deserialize(), and it accepts a string containing a WDDX packet as its only argument.

Listing 5.14 demonstrates how a PHP variable encoded in WDDX can be deserialized by wddx_deserialize().

Listing 5.14 Deserializing a WDDX Packet into a Native PHP Structure

<?php
$flavor = "blueberry";

// print value before converting to WDDX
echo "Before serialization, \$flavor = $flavor <br>";

// serialize into WDDX packet
$packet = wddx_serialize_value($flavor);

// deserialize generated packet and display value
echo "After serialization, \$flavor = " .
    wddx_deserialize($packet);
?>

This works with arrays, too; in Listing 5.15, the deserialized result $output is an array containing the same elements as the original array $stooges.

Listing 5.15 Deserializing a WDDX Packet into a PHP Array

<?php
$stooges = array("larry", "curly", "moe");

// serialize into WDDX packet
$packet = wddx_serialize_value($stooges);

// deserialize generated packet
$output = wddx_deserialize($packet);

// view it
print_r($output);
?>
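The WDDX module was removed from the core PHP distribution in PHP 7.4, so wddx_deserialize() may not be available on a current build. The sketch below shows roughly what deserialization involves, using SimpleXML to walk a packet and rebuild native PHP values. The function names wddx_sketch_value() and wddx_sketch_deserialize() are hypothetical helpers written for this illustration, not part of any PHP library, and they cover only the string, number, boolean, array, and struct types.

```php
<?php
// Hypothetical helper: turn one WDDX value element into a PHP value.
function wddx_sketch_value(SimpleXMLElement $node)
{
    switch ($node->getName()) {
        case 'string':
            return (string) $node;
        case 'number':
            return (float) (string) $node;
        case 'boolean':
            // WDDX writes booleans as <boolean value='true'/>
            return ((string) $node['value']) === 'true';
        case 'array':
            $list = [];
            foreach ($node->children() as $item) {
                $list[] = wddx_sketch_value($item);
            }
            return $list;
        case 'struct':
            // each <var name='...'> wraps exactly one value element
            $map = [];
            foreach ($node->children() as $var) {
                foreach ($var->children() as $inner) {
                    $map[(string) $var['name']] = wddx_sketch_value($inner);
                }
            }
            return $map;
    }
    throw new InvalidArgumentException('Unsupported type: ' . $node->getName());
}

// Hypothetical helper: decode the first value inside <data>.
function wddx_sketch_deserialize(string $packet)
{
    $xml = simplexml_load_string($packet);
    foreach ($xml->data->children() as $payload) {
        return wddx_sketch_value($payload);
    }
    return null;
}

// The packet from Listing 5.6, decoded back into a PHP array.
$packet = "<wddxPacket version='1.0'><header/><data><array length='4'>"
        . "<string>strawberry</string><string>chocolate</string>"
        . "<string>raspberry</string><string>peach</string>"
        . "</array></data></wddxPacket>";

$flavors = wddx_sketch_deserialize($packet);
print_r($flavors);
```

The recursion mirrors the nesting of the packet itself: arrays and structs simply decode each of their children in turn.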
A Few Examples

So that's the theory. There wasn't much of it, but don't let that discourage you; it's possible to build some fairly powerful distributed applications using the simple functions described in the previous sections.

Information Delivery with WDDX and HTTP

This section discusses one of the most popular applications of this technology: using it to build a primitive push/pull engine for information delivery over the web. I'll be using a MySQL database as the data source, WDDX to represent the data, and PHP to perform the actual mechanics of the transaction.

Requirements

Let's assume the existence of a fictional corporation, XTI Inc., that plans to start up an online subscription service offering access to share market data. XTI already has access to this information via an independent source, and its database of stocks and their prices is automatically updated every few minutes with the latest market data. XTI's plan is to offer customers access to this data, allowing them to use it on their own web sites in exchange for a monthly subscription fee.

Listing 5.16 has a slice of the MySQL table that holds the data we're interested in.

Listing 5.16 A Sample Recordset from the MySQL Table Holding Stock Market Information

+--------+--------+---------------------+
| symbol | price  | timestamp           |
+--------+--------+---------------------+
| DTSJ   |  78.46 | 2001-11-22 12:20:57 |
| DNDS   |   5.89 | 2001-11-22 12:32:12 |
| MDNC   |  12.94 | 2001-11-22 12:21:34 |
| CAJD   | 543.89 | 2001-11-22 12:29:01 |
| WXYZ   | 123.67 | 2001-11-22 12:28:32 |
+--------+--------+---------------------+

All that is required is an interface to this database so that subscribers to the service can connect to the system and obtain prices for all or some stocks (keyed against each stock's unique four-character symbol).

Implementing these requirements via WDDX is fairly simple and can be accomplished via two simple scripts, one for each end of the connection.
A WDDX server can be used at the XTI end of the connection to accept incoming client requests and deliver formatted WDDX packets to them. At the other end of the connection, WDDX-friendly clients can read these packets, decode them, and use them in whatever manner they desire.

Server

Let's implement the server first. Listing 5.17 has the complete code.

Listing 5.17 A Simple WDDX Server

<?php
// server.php - creates WDDX packet containing symbol, price and timestamp of

This may appear complex, but it's actually pretty simple. Because the data is stored in a database, the first task must be to extract it using standard MySQL query functions. The returned resultset may contain either a complete list of all stocks currently in the database with their prices or a single record corresponding to a client-specified stock symbol.

Next, this data must be packaged into a form that can be used by the client. For this example, I packaged the data into an associative array named $sPackage, whose every key corresponds to a stock symbol in the table. Every key is linked to a value, which is itself a two-element array containing the price and timestamp.

After all the records in the resultset are processed, the $sPackage array is serialized into a WDDX packet with wddx_serialize_value() and then printed as output.

Client

So, you now have a server that is capable of creating a WDDX packet from the results of a database query. All you need now is a client to connect to this server, retrieve the packet, and decode it into a native PHP array for use on an HTML page. Listing 5.18 contains the code for this client.
Listing 5.18 A Simple WDDX Client

<?php
// client.php - read and decode WDDX packet
// this script runs at http://brutus.clientdomain.com/client.php

// url of server page
$url = "http://caesar.xtidomain.com/customers/server.php";

// probably implement some sort of authentication mechanism here
// proceed further only if client is successfully authenticated

// read WDDX packet into string
$output = join('', file($url));

// deserialize
$cPackage = wddx_deserialize($output);
?>
<html>
<head>
<basefont face="Arial">
<!-- reload page every two minutes for latest data -->
<meta http-equiv="refresh" content="120; URL=http://brutus.clientdomain.com/client.php">
</head>
<body>
<?
// if array contains data
if (sizeof($cPackage) > 0)
{
    // format and display
?>
<table border="1" cellspacing="5" cellpadding="5">
<tr>
<td><b>Symbol</b></td>
<td><b>Price (USD)</b></td>
<td><b>Timestamp</b></td>
</tr>
<?php
    // iterate through array
    // key => symbol
    // value = array(price, timestamp)
    while (list($key, $value) = each($cPackage))
    {
        echo "<tr>\n";
        echo "<td>$key</td>\n";
        echo "<td>$value[0]</td>\n";
        echo "<td>$value[1]</td>\n";
        echo "</tr>\n";
    }
?>
</table>
<?
}
else
{
    echo "No data available";
}
?>
</body>
</html>

The client is even simpler than the server. It connects to the specified server URL and authenticates itself. (I didn't go into the details of the authentication mechanism to be used, but it would probably be a host-username-password combination to be validated against XTI's customer database.) It then reads the WDDX packet printed by the server into an array with the file() function. This array is then converted into a string and deserialized into a native PHP associative array with wddx_deserialize().

After the data is decoded into a PHP associative array, a while loop can be used to iterate through it, identifying and displaying the important elements as a table. Figure 5.1 shows what the resulting output looks like.
Figure 5.1

The beauty of this system is that the server and connecting clients are relatively independent of each other. As long as a client has the relevant permissions, and understands how to connect to the server and read the WDDX packet returned by it, it can massage and format the data per its own special requirements. To illustrate this, consider Listing 5.19, which demonstrates an alternative client, this one performing a "search" on the server for a user-specified stock symbol.

Listing 5.19 An Alternative WDDX Client

<?php
// client.php - read and decode WDDX packet
// this script runs at http://brutus.clientdomain.com/client.php

if (!isset($_POST['submit']))
{
?>
<!-- search page -->
<!-- lots of HTML layout code - snipped out -->
<form action="<? echo $_SERVER['PHP_SELF']; ?>" method="post">
Enter stock symbol:
<input type="text" name="symbol" size="4" maxlength="4">
<input type="submit" name="submit" value="Search">
</form>
<?
}
else
{
    // perform a few error checks
    // sanitize search term

    // query server with symbol as parameter
    // (the symbol is URL-encoded before it is placed in the query string)
    $symbol = $_POST['symbol'];
    $url = "http://caesar.xtidomain.com/customers/server.php?symbol=" . urlencode($symbol);

    // probably implement some sort of authentication mechanism here
    // proceed further only if client is successfully authenticated

    // read WDDX packet into string
    $output = join('', file($url));

    // deserialize
    $cPackage = wddx_deserialize($output);

    // if any data in array
    if (sizeof($cPackage) > 0)
    {
        // format and display
        list($key, $value) = each($cPackage);
        echo "Current price for symbol $key is $value[0]";
    }
    else
    {
        echo "No data available";
    }
}
?>

This script consists of two parts: a search form, displayed when the page is first requested, and the code that queries the server with the submitted symbol and displays the price returned in the WDDX packet.
Again, even though the two clients operate in two different ways (one displays a complete list of items, whereas the other uses a search term to filter down to one specific item), no change was required to the server or to the formatting of the WDDX packet.
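To make the packaging step on the server concrete, here is a minimal sketch of how the $sPackage array described earlier could be built and turned into a packet. The $rows array is hypothetical sample data standing in for the MySQL resultset, and wddx_sketch_encode() is an illustrative pure-PHP stand-in for wddx_serialize_value() (useful where the WDDX module is not installed), not the module's actual implementation.

```php
<?php
// Hypothetical rows, shaped like the recordset in Listing 5.16.
$rows = [
    ['symbol' => 'DTSJ', 'price' => '78.46', 'timestamp' => '2001-11-22 12:20:57'],
    ['symbol' => 'DNDS', 'price' => '5.89',  'timestamp' => '2001-11-22 12:32:12'],
];

// Key each record by stock symbol; the value is array(price, timestamp).
$sPackage = [];
foreach ($rows as $row) {
    $sPackage[$row['symbol']] = [$row['price'], $row['timestamp']];
}

// Illustrative encoder: lists become <array>, other arrays become
// <struct>, and scalars become <number>, <boolean> or <string>.
function wddx_sketch_encode($value): string
{
    if (is_array($value)) {
        $isList = $value === [] || array_keys($value) === range(0, count($value) - 1);
        if ($isList) {
            $xml = "<array length='" . count($value) . "'>";
            foreach ($value as $item) {
                $xml .= wddx_sketch_encode($item);
            }
            return $xml . "</array>";
        }
        $xml = "<struct>";
        foreach ($value as $key => $item) {
            $name = htmlspecialchars((string) $key, ENT_QUOTES);
            $xml .= "<var name='$name'>" . wddx_sketch_encode($item) . "</var>";
        }
        return $xml . "</struct>";
    }
    if (is_bool($value)) {
        return "<boolean value='" . ($value ? 'true' : 'false') . "'/>";
    }
    if (is_int($value) || is_float($value)) {
        return "<number>$value</number>";
    }
    return "<string>" . htmlspecialchars((string) $value, ENT_QUOTES) . "</string>";
}

$packet = "<wddxPacket version='1.0'><header/><data>"
        . wddx_sketch_encode($sPackage)
        . "</data></wddxPacket>";

echo $packet;
```

Because each symbol maps to a two-element list, a client can read the price as $value[0] and the timestamp as $value[1], exactly as the clients in Listings 5.18 and 5.19 do.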
Remote Software Updates with WDDX and Socket Communication

The preceding section, "Information Delivery with WDDX and HTTP," demonstrated a WDDX client and server running over HTTP. As you might imagine, though, that's not the only way to use WDDX; this next example demonstrates WDDX-based data exchange using socket communication between a PHP server and client.

Requirements

In order to set the tone, let's again consider a fictional organization, Generic Corp (GCorp), which provides its customers with Linux-based software widgets. GCorp updates these widgets on a regular basis, and makes them available to paying customers via an online repository.

Now, GCorp has no fixed update schedule for these widgets; they're handled by different development teams, and are released to the online repository as installable files in Red Hat Package Manager (RPM) format at irregular intervals. What GCorp wants is a way for every customer to automatically receive notification of software updates as and when they're released.

Most companies would send out email notification every time an update happened. But this is GCorp, and they like to make things complicated. What GCorp has planned, therefore, is to have a WDDX server running on its web site, which automatically scans the repository and creates a WDDX packet containing information on the latest software versions available. This information can then be provided to any requesting client.

The client at the other end should have the necessary intelligence built into it to compare the version numbers received from the server with the version numbers of software currently installed on the local system. It may then automatically download and install the latest versions, or simply send notification to the system administrator about the update.

This is easily accomplished with WDDX; for variety, I'll perform the data exchange over TCP/IP sockets rather than HTTP.
Server

Let's begin with the server (see Listing 5.21), which opens up a socket and waits for connections from requesting clients.

Listing 5.21 A WDDX Server to Read and Communicate Version Information Over a TCP/IP Socket

<?php
// IMPORTANT! This script should not be run via your Web server!
// You will need to run it from the command line,
// or as a service from inetd.conf

// set up some socket parameters
$ip = "127.0.0.1";
$port = 7890;

// area to look for updated files
$repository = "/tmp/updates/";

// start with socket creation
// get a handler
// (current PHP versions return false on failure, so test for that)
if (($socket = socket_create(AF_INET, SOCK_STREAM, 0)) === false)
{
    // this is fairly primitive error handling
    echo "Could not create socket\n";
}

// bind to the port
if (($ret = socket_bind($socket, $ip, $port)) === false)
{
    echo "Could not bind to socket\n";
}

// start listening for connections
if (($ret = socket_listen($socket, 7)) === false)
{
    echo "Could not create socket listener\n";
}

// if incoming connection, accept and spawn another socket
// for data transfer
if (($child = socket_accept($socket)) === false)
{
    echo "Could not accept incoming connection\n";
}

if (!$input = socket_read($child, 2))
{
    echo "Could not read input\n";
}
else
{
    // at this stage, GCorp might want to perform authentication
    // using the input received by the client

    // assuming authentication succeeds...
    // look in the updates directory
    $dir = opendir($repository);
    while ($file = readdir($dir))
    {
        // omit the "." and ".." directories
        if ($file != "." && $file != "..")
        {
            $info = explode("-", $file);
            // create an array of associative arrays, one for each

I will not get into the details of how the socket server is actually created (if you're interested, the PHP manual has extensive information on this) but will instead focus on how the server obtains information on the updates available and serializes it into a WDDX packet.

The variable $repository sets up the location of the online software repository maintained by GCorp's QA team.
When the socket server receives an incoming connection, it obtains a file list from the repository and creates an array whose every element corresponds to a file in the repository. Every element of the array is itself an associative array that contains the keys name, version, and size, corresponding to the package name, version, and size, respectively. (Some of this information is obtained by parsing the filename with PHP's string functions.)

This entire array is serialized with wddx_serialize_value() and written to the requesting client via the open socket.

I used PHP to implement the server here for convenience; however, it's just as easy to use Perl, Python, or Java (as discussed in the "Perl Of Wisdom" sidebar). Note also that, as of this writing, socket programming support was added to PHP fairly recently and is, therefore, not yet completely stable.

Client

At the other end of the connection, a WDDX-compliant client has to deserialize the packet received from the server and then compare the information within it against the information it has on locally installed versions of the software. Listing 5.22 demonstrates one implementation of such a client.

Listing 5.22 A WDDX Client to Retrieve Version Information Over a TCP/IP Socket

<?php
// IMPORTANT! This script should not be run via your Web server!
// You will need to run it manually from the command line, or via crontab

// set up some socket parameters
// this is the IP address of the socket server
$ip = "234.56.789.1";
$port = 7890;

// open a socket connection
$socket = fsockopen($ip, $port);
if (!$socket)
{
    echo "Could not open connection\n";
}
else
{
    // send a carriage return
    fwrite($socket, "\n");
    $packet = fgets($socket, 4096);

    // get and deserialize list of server packages
    $remote_packages = wddx_deserialize($packet);

    // close the socket
    fclose($socket);

    // make sure that the deserialized packet is an array
    if (!is_array($remote_packages))
    {
        $message = "Bad/unsupported data format received\n";
    }
    else
    {
        // now, start processing the received data
        for ($x = 0; $x < sizeof($remote_packages); $x++)
        {
            // for each item in the array
            // check to see if a corresponding package is installed on

In this case, the client uses PHP's fsockopen() function to connect to the server and retrieve the WDDX packet. It then deserializes this packet into an array, and proceeds to iterate through it.

Because all GCorp packages are distributed in RPM format, it's fairly simple to obtain information on the currently installed versions of the files listed in the array with the rpm utility (standard on most Linux systems). These version numbers are compared with the version numbers of files on GCorp's server (remember the version key of each associative array?), and two new arrays are created from the results of the comparison.
This information is then emailed to the system administrator via PHP's mail() function. This isn't the only option, obviously; a variant of this might be for the client to download and install the new software automatically. A more sophisticated client might even identify the new packages and send advertisements for, or information on, new software available, on a per-customer basis.
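The comparison step can be sketched as follows. This is only an illustration under stated assumptions: the package data is hypothetical, the $installed array stands in for whatever the rpm utility would report, and the names $outdated and $missing are my own choice for the two result arrays rather than those used in the original listing. The comparison itself relies on PHP's built-in version_compare() function.

```php
<?php
// Hypothetical list as it might arrive from the server: one
// associative array per package, with name, version and size keys.
$remote_packages = [
    ['name' => 'alpha', 'version' => '1.2.0', 'size' => 20480],
    ['name' => 'beta',  'version' => '2.0.1', 'size' => 10240],
    ['name' => 'gamma', 'version' => '0.9',   'size' => 5120],
];

// Hypothetical locally installed versions, keyed by package name
// (a real client would collect these with the rpm utility).
$installed = ['alpha' => '1.1.4', 'beta' => '2.0.1'];

$outdated = [];   // installed, but an older version than the server's
$missing  = [];   // offered by the server, not installed locally

foreach ($remote_packages as $pkg) {
    $name = $pkg['name'];
    if (!isset($installed[$name])) {
        $missing[] = $name;
    } elseif (version_compare($installed[$name], $pkg['version'], '<')) {
        $outdated[] = "$name ({$installed[$name]} -> {$pkg['version']})";
    }
}

print_r($outdated);
print_r($missing);
```

Using version_compare() rather than a plain string comparison avoids mistakes such as treating version 1.10 as older than 1.9.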
Other Applications for WDDX

As the preceding examples demonstrated, WDDX makes it possible to exchange data between different sites and systems in a simple and elegant manner. Consequently, one of its more common applications involves acting as the vehicle for the syndication of frequently updated content over the web. Examples of areas in which WDDX can be used include the following:

News syndication services, which "push" the latest headlines, sports scores, stock and currency market information, and weather forecasts to connecting clients from a news database (for an example, check out http://www.moreover.com/, which offers news headlines in WDDX)
Summary

This chapter discussed WDDX, a method of creating platform-neutral data structures for information exchange over the web. It discussed PHP's WDDX support, explaining how to serialize and deserialize WDDX packets, and demonstrated how PHP and WDDX could be used to create a content distribution server for financial information updates and a software distribution network for web-based software updates. It also briefly discussed other applications of this technology for content dissemination over the web.
copyright © 2002, Melonfire. all rights reserved.