Data Support

Data Fusion

“The questionnaire is too long”

Sound familiar? The themes and topics to be investigated are so extensive that your questionnaire is too long. The natural result is more break-offs and lower-quality responses. But you do not want to, or cannot, dispense with any of the information. The only realistic way to reduce the side effects of an excessively long survey is to split it into two parts. Ultimately, though, you will need one coherent data set. We have the expertise and the appropriate algorithms to assist you in this tricky task.

Data fusion as a possible solution

Depending on the nature of the questions, the solution may be to combine the surveys by means of data fusion. Each case in the first survey (recipient) is allocated a “statistical twin” in the second survey (donor): the nearest neighbor, i.e. the case with the greatest similarity in key questions. The twin's answers are then added to the recipient's record, which creates a complete data set.
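The idea of the statistical twin can be illustrated with a minimal sketch. It assumes that all key questions are numeric and uses a plain squared distance as the (inverse) similarity measure; the function name, variable names and data are hypothetical and only serve as an illustration, not as the procedure actually used.

```python
def fuse_nearest_neighbor(recipients, donors, link_vars, donor_vars):
    """Attach the answers of the most similar donor to each recipient."""
    fused = []
    for rec in recipients:
        # Squared distance over the key questions; smaller means more similar.
        twin = min(
            donors,
            key=lambda don: sum((rec[v] - don[v]) ** 2 for v in link_vars),
        )
        row = dict(rec)                               # keep the recipient's own answers
        row.update({v: twin[v] for v in donor_vars})  # add the twin's answers
        fused.append(row)
    return fused


# Hypothetical data: q_a was asked only in survey 1, q_b only in survey 2.
recipients = [{"age": 25, "income": 30, "q_a": 4},
              {"age": 60, "income": 55, "q_a": 2}]
donors = [{"age": 58, "income": 50, "q_b": 1},
          {"age": 27, "income": 32, "q_b": 5}]

fused = fuse_nearest_neighbor(recipients, donors, ["age", "income"], ["q_b"])
print(fused)
```

In this sketch a donor may serve as the twin of several recipients, and each recipient is matched on its own rather than across the whole sample.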

Mathematical-statistical procedure

First, the connecting variables are determined; the variables that enter the similarity calculation can be either weighted or unweighted. Then, a similarity value is calculated for each pair of recipient and donor. For metric questions, a small deviation results in only a small loss of similarity. For nominal connecting variables, there are only two conditions: “same” or “not same”. Sameness can also be forced for one or more connecting variables. The total similarity value is calculated from the deviations of all connecting variables. Finally, the optimal allocation is determined, i.e. the allocation that leads to the lowest possible deviation across the whole sample.
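The steps above can be sketched as follows. This is a simplified illustration under several assumptions: metric deviations are scaled by a value range chosen by the analyst, nominal deviations are 0 or 1, forced sameness is modeled as an infinite deviation, and the optimal allocation is found by brute force over all one-to-one allocations, which is only feasible for very small samples. All names and weights are hypothetical.

```python
from itertools import permutations
import math


def deviation(rec, don, metric, nominal, forced=()):
    """Total weighted deviation of one recipient/donor pair (lower = more similar)."""
    # Forced sameness: a mismatch rules the pair out entirely.
    if any(rec[v] != don[v] for v in forced):
        return math.inf
    total = 0.0
    # Metric variables: a small difference adds only a small deviation.
    for var, (weight, value_range) in metric.items():
        total += weight * abs(rec[var] - don[var]) / value_range
    # Nominal variables: only "same" (0) or "not same" (1).
    for var, weight in nominal.items():
        total += weight * (0.0 if rec[var] == don[var] else 1.0)
    return total


def optimal_allocation(recipients, donors, **kw):
    """Try every one-to-one allocation and keep the one with the lowest total deviation."""
    best, best_cost = None, math.inf
    for perm in permutations(range(len(donors)), len(recipients)):
        cost = sum(deviation(recipients[i], donors[j], **kw)
                   for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = perm, cost
    return best, best_cost


recipients = [{"age": 30, "region": "north", "sex": "f"},
              {"age": 40, "region": "south", "sex": "f"}]
donors = [{"age": 38, "region": "north", "sex": "f"},
          {"age": 31, "region": "south", "sex": "f"},
          {"age": 30, "region": "north", "sex": "m"}]

best, cost = optimal_allocation(
    recipients, donors,
    metric={"age": (1.0, 50.0)},   # weight 1.0, assumed value range of 50 years
    nominal={"region": 0.5},       # weight 0.5
    forced=("sex",),               # sex must match exactly
)
print(best, cost)
```

Here `best[i]` is the index of the donor allocated to recipient `i`. For realistic sample sizes, a dedicated assignment algorithm would replace the brute-force search.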

This process can also be influenced by setting a minimum similarity value. Below this value, no allocation takes place, even if this means that not every recipient can be allocated a donor. It can also be specified whether priority should be given to the best possible similarity (possibly with multiple use of a single donor) or to maintaining representativeness in the donor sample (all donors are used equally). When similarities are equal, the allocation is made at random.
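These controls can be illustrated with a simplified sketch. It assumes a precomputed deviation matrix (lower deviation means higher similarity), expresses the minimum similarity as a maximum allowed deviation, and models the trade-off between similarity and donor representativeness with a simple cap on how often a donor may be used. It allocates recipients one after another rather than optimizing across the whole sample; all names and numbers are hypothetical.

```python
import random


def allocate(dev, max_deviation, max_uses=None, seed=0):
    """Allocate donors from a deviation matrix dev[recipient][donor].

    max_deviation stands in for the minimum similarity value.
    max_uses caps how often one donor may be used; None allows unlimited reuse.
    """
    rng = random.Random(seed)
    uses = [0] * len(dev[0])
    result = []
    for row in dev:
        allowed = [
            j for j, d in enumerate(row)
            if d <= max_deviation and (max_uses is None or uses[j] < max_uses)
        ]
        if not allowed:
            result.append(None)        # similarity too low: recipient stays unallocated
            continue
        best = min(row[j] for j in allowed)
        ties = [j for j in allowed if row[j] == best]
        choice = rng.choice(ties)      # random allocation among equally similar donors
        uses[choice] += 1
        result.append(choice)
    return result


dev = [[0.1, 0.4],
       [0.2, 0.3],
       [0.9, 0.8]]

best_similarity = allocate(dev, max_deviation=0.5)             # donor 0 is used twice
even_usage = allocate(dev, max_deviation=0.5, max_uses=1)      # each donor used at most once
print(best_similarity, even_usage)
```

In both runs the third recipient receives no donor, because none of its deviations falls within the allowed limit.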

We can assist you

If you have any questions about data fusion, or the possibilities and limits of this interesting procedure, our experts will be happy to consult with you.


Contact

Martin Klein
Tel.: +49 40 25 17 13 - 18
E-Mail:

Norbert Schroeder
Tel.: +49 40 25 17 13 - 19
E-Mail: