...

To calculate this metric, the respective metric kernel needs the current NumberOfErrors and the OverallLinesOfCode. Often, lines of code are provided by a code analysis tool like Sonar. However, these systems typically only count lines of code on a component or build fragment level. To calculate the overall ErrorDensity, another metric kernel needs to sum up the lines of code values from all the build fragments or components to provide a new derived measure, OverallLinesOfCode. The NumberOfErrors could be calculated by a CRM metric kernel. This example shows the need for a circular data flow between different metric kernels.
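The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration: the text names the inputs but not the formula, so the ratio of errors to lines of code, the scaling to 1000 lines, the function names, and the sample numbers are all assumptions.

```python
from typing import Mapping


def overall_lines_of_code(loc_per_component: Mapping[str, int]) -> int:
    """Derived measure: sum of the per-component counts from the analysis tool."""
    return sum(loc_per_component.values())


def error_density(number_of_errors: int, overall_loc: int, per_lines: int = 1000) -> float:
    """Errors per `per_lines` lines of code (the scaling is an assumption)."""
    if overall_loc <= 0:
        raise ValueError("overall lines of code must be positive")
    return number_of_errors * per_lines / overall_loc


# Hypothetical base measures for three build fragments.
loc = {"core": 12_000, "ui": 6_000, "adapters": 2_000}
total = overall_lines_of_code(loc)
density = error_density(50, total)
```

With these sample values, the derived OverallLinesOfCode is 20000 and the density is 2.5 errors per 1000 lines.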

Fig. 2. Measurement and data flow in the EMI

The Measurement Cache located in the Calculation and Storage Layer is a central infrastructure component. It stores all measurement values so they are immediately accessible for visualization components. This also allows the visualization components to directly access base measures if needed. However, the tradeoff of this architectural decision is that the visualization components have to use the stored values.

...

The Enterprise Measurement Data Bus (EMDB), an implementation of an Enterprise Service Bus ([15], [22]), needs to transport the measurement values, either from a Data Adapter (Base Measure) or from a Metric Kernel (Derived Measure), to all the Metric Kernels and the Measurement Cache of the system.
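The following Python sketch illustrates this transport with an in-memory stand-in for the bus. It is not the EMDB implementation, which the paper describes as JMS-based. The class names, the dictionary message format, and the metric identifiers emi.sonar.LinesOfCode and emi.sonar.OverallLinesOfCode are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, str]  # keys used here: metricRefId, eomId, value


class InMemoryBus:
    """Toy broadcast channel: every subscriber receives every message."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[Message], None]] = []

    def subscribe(self, callback: Callable[[Message], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, message: Message) -> None:
        for callback in list(self._subscribers):
            callback(message)


class MeasurementCache:
    """Stores the latest value of every base and derived measure."""

    def __init__(self, bus: InMemoryBus) -> None:
        self.values: Dict[Tuple[str, str], str] = {}
        bus.subscribe(self.on_message)

    def on_message(self, message: Message) -> None:
        self.values[(message["metricRefId"], message["eomId"])] = message["value"]


class OverallLocKernel:
    """Consumes a base measure and publishes a derived measure back to the bus."""

    def __init__(self, bus: InMemoryBus) -> None:
        self._bus = bus
        self._loc: Dict[str, int] = {}
        bus.subscribe(self.on_message)

    def on_message(self, message: Message) -> None:
        if message["metricRefId"] != "emi.sonar.LinesOfCode":
            return  # ignore everything else, including our own derived measure
        self._loc[message["eomId"]] = int(message["value"])
        self._bus.publish({
            "metricRefId": "emi.sonar.OverallLinesOfCode",
            "eomId": "project",
            "value": str(sum(self._loc.values())),
        })


bus = InMemoryBus()
cache = MeasurementCache(bus)
kernel = OverallLocKernel(bus)
# A data adapter would publish base measures like these.
bus.publish({"metricRefId": "emi.sonar.LinesOfCode", "eomId": "core", "value": "12000"})
bus.publish({"metricRefId": "emi.sonar.LinesOfCode", "eomId": "ui", "value": "6000"})
```

After the two base measures are published, the cache holds both of them as well as the derived measure, because the kernel's result travels over the same channel.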

Fig. 3. Concept of the EMDB (Notation by Chappell [5])

The main concept of the EMDB, a publish/subscribe channel, is depicted in Fig. 3. It also shows two Data Adapters (as generic endpoints) and a Metric Kernel (as a Java API client). The messages that are broadcast over the channel are of type EMDB Message (or a subtype of this). The next section describes these message types.

C. EMDB Measurement Messages

The main design principles for the EMI are separation of concerns and loose coupling. Hence, metric kernels and data adapters need to be completely separated. A metric kernel just needs to know what measures it requires for its calculations. The data provider just provides specific measures (values) for specific entities (Entities of Measurement, or EOMs). Consequently, the messages sent over the EMDB need to inherit from a general EMDB Message type (see Fig. 4) which just defines three important attributes:

metricRefId represents the identifier (name) of the measure. We propose using a namespace schema for the identifiers like {globalNamespace}.{msgClass}.{msgSubClass}*.{metric}. Examples would be emi.crm.NumberOfErrors, or emi.ev.ev, emi.ev.pv, and emi.ev.cv for the Earned Value Analysis metrics earned value, planned value, and cost variance. This identifier is used by the metric kernels to filter the EMDB messages according to their measurement requirements.

eomId is the identifier of the EOM. It is used to provide a broad variety of measures for the same entity. The data providers typically use an internal eomId from the base systems which they adapt. The metric kernels typically reuse the eomIds from the base measures. The domain synonym repository in the operations layer can be used to build groups of eomIds. This is necessary if different systems that are adapted to the EMI use different identifiers for the same business entity.

value represents the actual measurement value. It is designed as a string to allow a broad variety of values to be transported over the bus instead of just numerical values.
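The three attributes and their intended use can be sketched as follows. This is a minimal Python illustration and not the prototype's Java message class. The wildcard filter syntax, the synonym table, and the sample eomIds are assumptions; only the attribute names and the emi.* identifiers come from the text.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Iterable


@dataclass(frozen=True)
class EmdbMessage:
    metric_ref_id: str  # e.g. "emi.ev.pv"
    eom_id: str         # identifier of the entity of measurement
    value: str          # transported as a string, parsed by the consumer


def wants(message: EmdbMessage, patterns: Iterable[str]) -> bool:
    """True if the metricRefId matches one of the kernel's namespace patterns."""
    return any(fnmatchcase(message.metric_ref_id, p) for p in patterns)


# Hypothetical synonym groups: two base systems name the same project differently.
SYNONYMS = {"PRJ-A": "project-a", "proj_a": "project-a"}


def canonical_eom_id(eom_id: str) -> str:
    return SYNONYMS.get(eom_id, eom_id)


messages = [
    EmdbMessage("emi.crm.NumberOfErrors", "PRJ-A", "3"),
    EmdbMessage("emi.ev.pv", "proj_a", "1200.50"),
    EmdbMessage("emi.ev.cv", "proj_a", "-80"),
]
# An earned value kernel only subscribes to the emi.ev namespace.
earned_value_inputs = [m for m in messages if wants(m, ["emi.ev.*"])]
entities = {canonical_eom_id(m.eom_id) for m in messages}
```

Here the earned value kernel sees only the two emi.ev messages, and all three messages resolve to the same entity once the synonym groups are applied.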

Fig. 4. Base message type hierarchy

Fig. 4 depicts the general EMDB Message with two specialized messages (CRM Message and VCS Message). The general EMDB Message can be extended by every data adapter or metric kernel that is connected to the EMDB to form specific messages that include additional information required by specific metric kernels. In general, the data provider should include as much additional information with the message as possible so that the metric kernels can use it (for example, for filtering). The CRM Message, for example, carries additional ticketId and status attributes. This information is useful for specialized metric kernels like our RIFFLE³ Kernel, which analyzes and provides ticket flows from CRM systems.
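The hierarchy in Fig. 4 could be expressed as follows. The attribute names (ticketId, status, changedFiles) follow the figure, written in Python naming style. The helper function, the metric identifiers, and the sample data are assumptions that only hint at how a kernel such as RIFFLE might use the extra attributes.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class EmdbMessage:
    metric_ref_id: str
    eom_id: str
    value: str


@dataclass(frozen=True)
class CrmMessage(EmdbMessage):
    ticket_id: str
    status: str


@dataclass(frozen=True)
class VcsMessage(EmdbMessage):
    changed_files: Tuple[str, ...]  # one or more files per commit


def latest_status_per_ticket(messages: Iterable[EmdbMessage]) -> Dict[str, str]:
    """Keep the most recent status of every unique ticket, ignoring other messages."""
    status: Dict[str, str] = {}
    for message in messages:
        if isinstance(message, CrmMessage):
            status[message.ticket_id] = message.status
    return status


stream = [
    CrmMessage("emi.crm.TicketStatus", "project-a", "open", "42", "open"),
    VcsMessage("emi.vcs.Commit", "project-a", "1", ("a.py", "b.py")),
    CrmMessage("emi.crm.TicketStatus", "project-a", "open", "43", "open"),
    CrmMessage("emi.crm.TicketStatus", "project-a", "closed", "42", "closed"),
]
statuses = latest_status_per_ticket(stream)
```

Because the specialized messages are subtypes of the general message, a kernel that only understands the three base attributes can still process them.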

D. Data Provision Mechanisms

The heterogeneity of the systems that are integrated into the infrastructure calls for flexible data provision mechanisms. We investigated three core provision concepts: Push-Forward, Pull-Forward, and Invoke-Push. We describe the main ideas and possible application scenarios in the following subsections.

1) Push-Forward

Fig. 5. Concept of the Push-Forward data provision mechanism

The Push-Forward data provision mechanism guarantees the lowest latency between a change event in the adapted system and the visualization. The sequence diagram in Fig. 5 shows the flow of interactions. The adapted system needs to offer a plug-in mechanism; a custom-built EMI plug-in is then able to hook onto the desired change events in the adapted system. The system calls the plug-in on every data change event. Then, the plug-in creates a (specialized) EMDB message and adds specific data to the message. The message is sent to the EMDB using a standard JMS Message Gateway. The data is then transported to the metric kernels and the measurement cache. Hence, the visualization components can immediately update the visualizations to reflect the new data.

³ The RIFFLE Metric Kernel can use this information to identify unique tickets and provide status flows for the RIVER visualization tool to allow a detailed analysis of flows in CRM systems.
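The interaction sequence of Fig. 5 can be sketched in Python as below. The adapted system, the plug-in hook, and the gateway are hypothetical stand-ins: a real deployment would use the plug-in API of the adapted tool and a JMS connection instead of the recording gateway shown here.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class RecordingGateway:
    """Stand-in for the message gateway; it only records what would be sent."""

    def __init__(self) -> None:
        self.sent: List[Message] = []

    def send(self, message: Message) -> None:
        self.sent.append(message)


class EmiPlugin:
    """Called by the adapted system on every data change event."""

    def __init__(self, gateway: RecordingGateway) -> None:
        self._gateway = gateway

    def on_data_change(self, ticket_id: str, new_status: str) -> None:
        # Create the (specialized) message ...
        message = {"metricRefId": "emi.crm.TicketStatus", "eomId": ticket_id, "value": new_status}
        # ... add specific data ...
        message["ticketId"] = ticket_id
        message["status"] = new_status
        # ... and send it immediately.
        self._gateway.send(message)


class AdaptedSystem:
    """Toy CRM with a plug-in hook for change events."""

    def __init__(self) -> None:
        self._tickets: Dict[str, str] = {}
        self._hooks: List[Callable[[str, str], None]] = []

    def register_plugin(self, hook: Callable[[str, str], None]) -> None:
        self._hooks.append(hook)

    def change_status(self, ticket_id: str, status: str) -> None:
        self._tickets[ticket_id] = status
        for hook in self._hooks:
            hook(ticket_id, status)


gateway = RecordingGateway()
crm = AdaptedSystem()
crm.register_plugin(EmiPlugin(gateway).on_data_change)
crm.change_status("42", "open")
crm.change_status("42", "closed")
```

Every change produces exactly one message at the moment it happens, which is why this mechanism keeps the delay between change and visualization small.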
2) Pull-Forward

Fig. 6. Concept of the Pull-Forward data provision mechanism

Standard BI (Business Intelligence) systems use scheduled jobs (called ETL: Extract, Transform, Load) to derive data from adapted systems. The Pull-Forward data provision mechanism is inspired by these ETL jobs. Fig. 6 shows the sequence of messages. The needed EMI Extract Tasks are triggered by a scheduler that is configured to a certain interval like every minute, hour, or day. The tasks then retrieve the changed data from the systems. Each task should then extract the unique data chunks from the retrieved data and create a message for every chunk, which is then sent as in Push-Forward.

Even though this provision mechanism is inspired by the most popular mechanism, ETL, it has some strong weaknesses. The most important one is latency, which increases dramatically. As a result, the data in the visualization is only as up to date as the latest pull. One solution would be to reduce the pull intervals to a minimum. However, pulling data from a system typically generates a high load in the system. Therefore, shortening the intervals will lead to performance degradation in the adapted systems. Another weakness of this solution is the increased effort to implement the data providers.
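A single run of an extract task could look like the following Python sketch. The revision-based change log, the function and class names, and the metric identifier are assumptions. The scheduler is deliberately left out: in practice something like a timer or cron job would call run_once at the configured interval.

```python
from typing import Callable, Dict, List, Tuple

Change = Tuple[int, str, str]  # (revision, ticket id, status)
Message = Dict[str, str]


class ExtractTask:
    """Pulls changes since the last run and sends one message per unique ticket."""

    def __init__(self, fetch_changes: Callable[[int], List[Change]],
                 send: Callable[[Message], None]) -> None:
        self._fetch_changes = fetch_changes
        self._send = send
        self._last_revision = 0

    def run_once(self) -> int:
        changes = self._fetch_changes(self._last_revision)
        latest: Dict[str, str] = {}
        # Changes are assumed to arrive ordered by revision, so later entries win.
        for revision, ticket_id, status in changes:
            latest[ticket_id] = status
            self._last_revision = max(self._last_revision, revision)
        for ticket_id, status in latest.items():
            self._send({"metricRefId": "emi.crm.TicketStatus", "eomId": ticket_id, "value": status})
        return len(latest)


# Hypothetical change log of the adapted system.
change_log: List[Change] = [(1, "42", "open"), (2, "43", "open"), (3, "42", "closed")]
sent: List[Message] = []
task = ExtractTask(lambda since: [c for c in change_log if c[0] > since], sent.append)

first = task.run_once()    # two unique tickets changed
second = task.run_once()   # nothing new since the last pull
change_log.append((4, "43", "closed"))
third = task.run_once()    # picked up only on the next interval
```

The change with revision 4 stays invisible until the next run, which illustrates the latency weakness described above: the visualization can never be fresher than the most recent pull.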

3) Invoke-Push

Fig. 7. Concept of the Invoke-Push data provision mechanism

The data stored in the adapted systems typically relate to each other. For example, a good practice in software development is to tag a commit into a version control system (VCS) with the task number of a task in a change request management system (CRM). The number of changed files per task could be used as a complexity measure for the task. Additionally, the number of changed lines of code could be used to normalize the effort for a task. Of course, every commit alters the number of changed files for a task. Hence, after every commit a special data adapter needs to send a new message to the EMDB containing additional information about the task. This then allows a special metric kernel to calculate the two measures.

Fig. 7 shows the sequence diagram of the Invoke-Push data provision mechanism, which enables EMI developers to implement a special data adapter for the described situation. A special EMDB Message Listener is invoked whenever it receives a certain type of message. It then pulls data from the adapted system (for example the task from the CRM). The data is then packed into a new (specialized) EMDB message and pushed to the EMDB. This mechanism also enables a combination of Push-Forward and Pull-Forward. For example, a VCS message could be used to pull data from a changed spreadsheet file in the VCS.
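The commit and task example can be sketched as a listener in Python. Everything concrete here is an assumption: the "#<task number>" tagging convention, the dictionary message layout, the metric identifiers, and the CRM lookup function.

```python
import re
from typing import Callable, Dict, List, Set

TASK_TAG = re.compile(r"#(\d+)")  # assumed convention: commit messages mention "#<task number>"


class InvokePushListener:
    """Invoked by VCS messages, pulls the task from the CRM, pushes a new message."""

    def __init__(self, pull_task: Callable[[str], Dict[str, str]],
                 send: Callable[[dict], None]) -> None:
        self._pull_task = pull_task
        self._send = send
        self._files_per_task: Dict[str, Set[str]] = {}

    def on_message(self, message: dict) -> None:
        if message.get("metricRefId") != "emi.vcs.Commit":
            return  # only commit messages invoke this adapter
        match = TASK_TAG.search(message.get("commitMessage", ""))
        if match is None:
            return  # untagged commit, nothing to relate
        task_id = match.group(1)
        task = self._pull_task(task_id)  # pull from the adapted system
        files = self._files_per_task.setdefault(task_id, set())
        files.update(message["changedFiles"])
        self._send({  # push the combined information back to the bus
            "metricRefId": "emi.crm.ChangedFilesPerTask",
            "eomId": task_id,
            "value": str(len(files)),
            "status": task["status"],
        })


crm_tasks = {"7": {"status": "open"}}  # hypothetical CRM content
sent: List[dict] = []
listener = InvokePushListener(crm_tasks.__getitem__, sent.append)
listener.on_message({"metricRefId": "emi.vcs.Commit", "commitMessage": "Fix parser #7",
                     "changedFiles": ["a.py", "b.py"]})
listener.on_message({"metricRefId": "emi.vcs.Commit", "commitMessage": "More tests for #7",
                     "changedFiles": ["b.py", "c.py"]})
listener.on_message({"metricRefId": "emi.crm.TicketStatus", "eomId": "7", "value": "open"})
```

The two commits touch three distinct files, so the listener reports 2 and then 3 changed files for task 7. The third message is not a commit and is ignored.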
E. Communication between Metric Kernels and Visualization

The measurement customer typically would like to alter some details in the metric calculation to answer more detailed or slightly tailored questions. For example, the question “Are we able to address all bugs?” could be answered by the number of open bugs in a CRM system. If the project is closing in on a release date, this question is typically tailored to “Are we able to address all important bugs?”, which is answered by the number of open bugs in the top categories (priority one and two).

A dashboard should allow tailoring for these specific situations. The change in the measurement needs to be reflected by the metric kernel. The metric kernel could either provide both of these metrics, or the visualization component could talk directly with the metric kernel and alter the calculation of the specific metric that is feeding a certain diagram. There are good arguments for both solutions. Hence, the EMI should allow both.

Two metrics could be easily implemented in a specific metric kernel, which could then feed the results back to the EMDB to allow a dashboard to access the values via the measurement cache. This solution is very elegant because it only requires the dashboard to fetch the data from the measurement cache. However, it generates additional effort in the implementation of the metric kernel because the kernel needs to generate more derived measures. Additionally, it can lead to an explosion in the number of metrics communicated over the EMDB, which could lead to difficulties in the maintenance and operation of the EMI. Also, this makes the measurement cache a central part of the EMI, which contradicts the idea of a federalist infrastructure.

The direct communication from a dashboard to a metric kernel requires additional communication flows in the EMI (the control arrows in Fig. 1). This also increases the complexity of the dashboard configuration because it now needs to take the (service) source of a metric into account. However, these problems can be solved by a good and flexible framework for the communication between the metric kernels and the visualization components. We propose a solution in which the metric kernels and the dashboard can exchange instances of variability models for each metric. These variability models include the variability points and variants for each metric. The measurement customer can then change these variability points and tailor the metrics to her specific needs.
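The paper does not specify the structure of these variability models, so the following Python sketch is purely illustrative. It assumes a single variability point (the bug priorities to count) with a fixed set of variants, and shows a dashboard tailoring the open bugs metric from "all bugs" to "important bugs". All class, method, and field names are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Iterable, Tuple


@dataclass(frozen=True)
class VariabilityPoint:
    name: str
    variants: Tuple[int, ...]  # all values the kernel can offer
    selected: Tuple[int, ...]  # the values currently in effect


class OpenBugsKernel:
    """Counts open bugs, restricted to the currently selected priorities."""

    def __init__(self, bugs: Iterable[Tuple[int, bool]]) -> None:
        self._bugs = list(bugs)  # (priority, is_open)
        priorities = (1, 2, 3, 4)
        self._point = VariabilityPoint("priority", priorities, priorities)

    def variability_model(self) -> VariabilityPoint:
        return self._point  # immutable, so it is safe to hand to the dashboard

    def apply(self, point: VariabilityPoint) -> None:
        if point.name != self._point.name or not set(point.selected) <= set(self._point.variants):
            raise ValueError("selection is not covered by the offered variants")
        self._point = replace(self._point, selected=point.selected)

    def value(self) -> int:
        return sum(1 for priority, is_open in self._bugs
                   if is_open and priority in self._point.selected)


kernel = OpenBugsKernel([(1, True), (2, True), (3, True), (4, False), (2, False)])
all_open = kernel.value()

# The dashboard receives the model, changes the selection, and sends it back.
model = kernel.variability_model()
kernel.apply(replace(model, selected=(1, 2)))
important_open = kernel.value()
```

With the sample data, the metric drops from 3 open bugs to 2 once only priorities one and two are selected, without a second metric being published over the bus.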
VI. EMI PROTOTYPE AND EVALUATION

First prototypes for EMI components and frameworks were developed in several theses as part of multiple industry cooperation projects [23], [24], [25]. Most importantly, the dashboard tool SCREEN and several data adapters⁴ and metric kernels are based on the EMI. SCREEN was successfully deployed and integrated into the software development processes and infrastructures at small and medium-sized companies (the results of these field studies are published separately). We are currently planning to integrate it into larger companies with more than 250 employees (one with over 1,200). Also, we are currently starting to integrate SCREEN (and the EMI) into the software development infrastructure used by over 700 research projects at RWTH Aachen University.

Our change request analysis metric kernel RIFFLE and the visualization tool RIVER are also based on the EMI. These tools proved to be very useful to analyze CRM data (details about the tools and analysis will be published separately). Additionally, they helped to research the performance of the complete EMI. Our simulations show that a JMS-based EMDB implementation and EJB/JPA-based metric kernels are able to operate with over 1,500 (CRM) messages per second and an average of 1,000 messages per second on a standard notebook running a GlassFish application server with OpenMQ. This allows the tools to import⁵ ClearQuest CSV dumps with over 28,500 tickets in under 25 seconds, which is more than sufficient for development. Our predictions are that a productive environment with dedicated message bus server(s) can dramatically increase these numbers. Hence, we do not think that the EMDB will become a performance bottleneck as feared by some of our industry partners.

We are currently working on the operation components and on the framework for the variability exchange between metric kernels and the visualization components. We are also working on several (generic) metric kernels and on several additional data adapters. All the work on the EMI implementation, metric kernels, data adapters, and visualization components in the last year showed the strengths of the infrastructure. The strong separation of concerns due to the federalist design helped to streamline the development in several simultaneous projects.

⁴ By April 2013 we had developed Data Adapters for: Trac, Redmine, JIRA, git, svn, Excel, ClearQuest CSV dumps, Hudson, Jenkins, SONAR, generic REST, generic SOAP.
VII. CONCLUSION

In this paper we proposed an Enterprise Measurement Infrastructure (EMI) which is based on best practices of service-oriented architectures. The EMI is based on a set of federalist systems rather than on a centralistic system to measure, analyze, and visualize different data. This design decision proved to work well in different implementation scenarios. In addition, the different parts of the EMI are well aligned with the business needs of measurement customers as proposed in the application scenario in Section II.

The most important (measurement) parts of the EMI are the data flow and the data provision mechanisms. The flexible data flow together with separated metric kernels helped to implement different EMI prototypes for our field studies in parallel. We strongly believe that we are now able to integrate all the different solutions into a large toolbox that helps to address upcoming integration problems in new field studies.

Even though the intermediate results are very promising, we still need to prove that the EMI is as maintainable and flexible as desired. Unfortunately, to answer this question we need to have EMI installations running in business contexts over a long period of time. Luckily, we already have some installations running and we are currently planning larger installations. This will help us to get valid results about the maintainability and performance of the EMI.

REFERENCES

[1] C. P. Team, “CMMI® for Development, Version 1.3 CMMI-DEV, V1.3,” 2010.
[2] S. Few, Information Dashboard Design: The Effective Visual Communication of Data. O’Reilly Media, Inc., 2006.
⁵ Importing the tickets includes reading the CSV file, parsing the data, generating messages for every ticket, sending the messages over the EMDB, receiving the messages in RIFFLE, interpreting the messages, and storing the data in the internal database of RIFFLE.
[3] M. Kunz, A. Schmietendorf, R. R. Dumke, and C. Wille, “Towards a service-oriented measurement infrastructure,” in Proc. of the 3rd Software Measurement European Forum (SMEF), 2006, pp. 197–207.
[4] S. Architecture, “Combining Service-Oriented Architecture and Event-Driven Architecture using an Enterprise Service Bus,” no. April, pp. 1–8, 2006.
[5] D. A. Chappell, Enterprise service bus, 1st ed. O’Reilly Media, Inc., 2004.
[6] R. R. Dumke, “Software-Messung und -Bewertung - Eine Bilanz.” 2012.
[7] K. Umapathy, S. Purao, and R. R. Barton, “Designing enterprise integration solutions: effectively,” European Journal of Information Systems, vol. 17, no. 5, pp. 518–527, 2008.
[8] S. Aier and R. Winter, “Fundamental Patterns for Enterprise Integration Services,” International Journal of Service Science Management Engineering and Technology IJSSMET, vol. 1, no. 1, pp. 33–49, 2010.
[9] T. Puschmann and R. Alt, “Enterprise Application Integration - The Case of the Robert Bosch Group,” vol. 00, no. c, pp. 1–10, 2001.
[10] R. Dąbrowski, K. Stencel, and G. Timoszuk, “Software is a directed multigraph,” Software Architecture, pp. 360–369, 2011.
[11] H. Wache and T. Voegele, “Ontology-based integration of information - a survey of existing approaches,” IJCAI-01 Workshop: Ontologies and Information Sharing, pp. 108–117, 2001.
[12] P. A. Bernstein and E. Rahm, “A survey of approaches to automatic schema matching,” The VLDB Journal, vol. 10, no. 4, pp. 334–350, Dec. 2001.
[13] S. Chaudhuri and U. Dayal, “An overview of data warehousing and OLAP technology,” ACM Sigmod record, no. March 1997, 1997.
[14] M. P. Papazoglou and D. Georgakopoulos, “Service-Oriented Computing,” Communications of the ACM, vol. 46, no. 10, pp. 24–28, 2003.
[15] M. P. Papazoglou, P. Traverso, S. Dustdar, and F. Leymann, “Service-Oriented Computing: a Research Roadmap,” International Journal of Cooperative Information Systems, vol. 17, no. 02, pp. 223–255, Jun. 2008.
[16] M. Kunz, A. Schmietendorf, R. R. Dumke, and C. Wille, “Towards a service-oriented measurement infrastructure,” pp. 197–207.
[17] A. Halevy, N. Ashish, and D. Bitton, “Enterprise information integration: successes, challenges and controversies,” in Proceedings of the 2005 ACM SIGMOD international conference on Management of data, 2005, pp. 778–787.
[18] R. Kishore, H. Zhang, and R. Ramesh, “Enterprise integration using the agent paradigm: foundations of multi-agent-based integrative business information systems,” Decision Support Systems, vol. 42, no. 1, pp. 48–78, Oct. 2006.
[19] M. Wooldridge, “Intelligent Agents: The Key Concepts,” in Proceedings of the 9th ECCAI-ACAI/EASSS 2001, AEMAS 2001, HoloMAS 2001 on Multi-Agent-Systems and Applications II-Selected Revised Papers, 2002, pp. 3–43.
[20] S. J. Russell, P. Norvig, J. F. Candy, J. M. Malik, and D. D. Edwards, Artificial intelligence: a modern approach, 3rd ed. Upper Saddle River, NJ, USA: Prentice-Hall, Inc., 2010.
[21] R. R. Dumke, R. Koeppe, and C. Wille, Software Agent Measurement and Self-Measuring Agent-Based Systems. 2000, pp. 1–44.
[22] M.-T. Schmidt, B. Hutchison, P. Lambros, and R. Phippen, “The Enterprise Service Bus: Making service-oriented architecture real,” IBM Systems Journal, vol. 44, no. 4, pp. 781–797, 2005.
[23] A. Steffens, “Entwurf eines Architekturmodells zur Integration heterogener Systeme in MeDIC,” 2013.
[24] F. Evers, “Konzeptionelle Erweiterung von Projektdashboards für unerfahrene Anwender,” 2012.
[25] C. Hans, “Einsatz von Metrik-Dashboards im industriellen Umfeld,” RWTH Aachen University, 2012.